TechTakes

Big brain tech dude got yet another clueless take over at HackerNews etc? Here's the place to vent. Orange site, VC foolishness, all welcome.

This is not debate club. Unless it’s amusing debate.

For actually-good tech, you want our NotAwfulTech community

DeepSeek roundup: banned by governments, no guard rails, lied about its training costs (pivot-to-ai.com)

submitted 1 month ago by dgerard@awful.systems to c/techtakes@awful.systems

[–] leisesprecher@feddit.org 36 points 1 month ago (20 children)

Even if they greatly underreported costs and their services are banned: the models are out there, open source and way more efficient than anything Meta and OpenAI could produce.

So it's pretty obvious that the tech giants are burning money for mediocre output.

[–] tyler@programming.dev -2 points 1 month ago (9 children)

I’m very confused by this, I had the same discussion with my coworker. I understand what the benchmarks are saying about these models, but have any of y’all actually used deepseek? I’ve been running it since it came out and it hasn’t managed to solve a single problem yet (70b param model, I have downloaded the 600b param model but haven’t tested it yet). It essentially compares to gpt-3 for me, which only cost OpenAI like $4-9 million to train (can’t remember the exact number right now).

I just do not see the “efficiency” here.

[+] Ksin@lemmy.world -11 points 1 month ago (2 children)

The 70b model is a distillation of Llama3.3, that is to say it replicates the output of Llama3.3 while using the deepseekR1 architecture for better processing efficiency. So any criticism of the capability of the model is just criticism of Llama3.3 and not deepseekR1.

[–] bitofhope@awful.systems 12 points 1 month ago

Thank you for shedding light on the matter. I never realized that 69b model is a pisstillation of Lligma peepee point poopoo, that is to say it complicates the outpoop of Lligma4.20 while using the creepbleakR1 house design for better processing deficiency. Now I finally realize that any criticism of Kraftwerk's 1978 hit Das Model is just criticism of Sugma80085 and not deepthroatR1.

[–] froztbyte@awful.systems 9 points 1 month ago

[to the tune of Fort Minor's Remember The Name]

10% senseless, 20% post
15% concentrated spirit of boast
5% reading, 50% pain
and a 100% reason to not post here again
