The debates around DeepSeek are intense - US vs. China, big vs. small models, open vs. closed source, and the shockingly efficient architecture it represents. Pride, fear, disbelief, disgust - all these emotions have clouded the facts. A few personal thoughts.

Thoughts on Training Costs:

1⃣ $6M Training Costs = Plausible IMO

Quick math: training cost ∝ (active params × tokens). DeepSeek-V3 (37B active params; 14.8T tokens) vs. Llama 3.1 (405B params; 15T tokens) gives (37 × 14.8) / (405 × 15) ≈ 9%, so V3 should theoretically cost about 9% of Llama 3.1's training run. The disclosed actual figures line up with this back-of-the-envelope math, meaning the numbers are directionally believable.

Plus, there was no hiding; the footnote clearly said: “the aforementioned costs include only the official training of DeepSeek-V3, excluding the costs associa