Motherboard makers apparently to blame for high-end Intel Core i9 CPU failures | Ars Technica

Earlier this month, we wrote that some of Intel’s recent high-end Core i9 and Core i7 processors had been crashing and exhibiting other weird issues in some games and that Intel was investigating the cause.

An Intel statement obtained by Igor’s Lab suggests that Intel’s investigation is wrapping up, and the company is pointing squarely in the direction of enthusiast motherboard makers that are turning up power limits and disabling safeguards to try to wring a little more performance out of the processors.

“While the root cause has not yet been identified, Intel has observed the majority of reports of this issue are from users with unlocked/overclock capable motherboards,” the statement reads. “Intel has observed 600/700 Series chipset boards often set BIOS defaults to disable thermal and power delivery safeguards designed to limit processor exposure to sustained periods of high voltage and frequency.”

Cossty,

Hardware Unboxed recently made a video arguing that Intel is mostly to blame: it doesn’t clearly define defaults, because it wants motherboards to “overclock” CPUs, since that looks good in benchmarks.

narc0tic_bird,

From what I understood from Hardware Unboxed, running without hard power limits is essentially “supported” by Intel and motherboard manufacturers weren’t compelled to stick to the “recommended” power limits.

The fact that the new “Intel Baseline” profile that was pushed to motherboards via a BIOS update is vastly inconsistent between manufacturers leads me to believe that Intel doesn’t clearly state “do this and this as default”.

I find it a bit cheap to put the blame solely on motherboard manufacturers here.

There are also reports of instabilities with CPUs running at supposedly safe power limits. I can’t confirm this, but I wouldn’t be surprised if these power limits also caused silicon degradation at an unexpectedly fast pace.

RedWeasel,

What I found interesting is that the “Intel baseline” setting doesn’t seem to be the default. So if a builder sells a PC with it set manually and the user later needs to update the BIOS or reset the settings to default, they will go back to unlimited.

quenpwn,

Maybe the mobo's VRM can't handle a high OC?

monkeyman512,

My guess is the motherboard manufacturers could get away with this in the past without any issues. But Intel is pushing chips so close to redline out of the box that now it causes problems.

MonkderDritte,

Nah, just Intel shifting the blame.

sploosh,

If a bunch of enthusiasts are simply forgetting their basic post-OC troubleshooting and causing tons of RMAs… well that’s just funny.

gravitas_deficiency,

That’s not what this is.

uninvitedguest,

From what I’m reading these aren’t enthusiasts running an overclock, but rather stock settings from aggressive boards put out by manufacturers?

Buelldozer,

Yep, that’s the claim. Basically some enthusiast motherboards, cough cough Asus cough cough, are shipping with stock settings that completely disable thermal throttling and allow essentially infinite voltage and current to the CPU! This wasn’t being done by people mucking with the settings after purchase; this is how they were being shipped!

I’d surmise that it was being done so that those motherboards could be marketed as being faster than the competition.

GuStJaR,

I was having issues with crashes in multiple games, but RDR2 was the worst. I had a rig built with an i9 14900K and an Asus Hero Z790.

I think I finally found the solution, and it had to do with the default BIOS settings for my Asus motherboard and my i9 14900K.

In the document linked here…

intel.com/…/13th-generation-intel-core-and-intel-…

Page 98, Table 17, Row 3 reveals that the stock turbo power limit for the 13900K and 14900K CPUs is 253 W, not the 4,000+ W my motherboard’s BIOS defaults to. Page 184, Table 77, Row 6 lists the maximum current limit as 307 A, far below the motherboard’s default of 500+ A.

I found this information in a Reddit post (reddit.com/…/optimizing_stability_for_intel_13900…) and followed the settings as follows:

ASUS Z790 Motherboards:

Save your current settings into a profile so you can return to them later if you want.

Reset your BIOS to default settings.

Ai Tweaker tab:

Disable MultiCore Enhancement.

Enable XMP (if your RAM supports it).

Set SVID behavior to Typical Scenario.

Set short duration turbo power = 253 W

Set long duration turbo power = 253 W

Set max core/cache current = 307 A

Doing this immediately stabilised the CPU temps and brought the average temp down by ~10 to 15°C. It’s been a few months now with zero crashes.
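For anyone who wants to sanity-check their own numbers against the limits quoted above, here is a minimal Python sketch. It is not an Intel or Asus tool: `PowerLimits`, `INTEL_STOCK` and `find_violations` are hypothetical names made up for illustration, the 253 W and 307 A figures are the ones quoted from the datasheet in this comment, and the "board default" values are just the lower bounds mentioned above (4,000+ W, 500+ A), not readings from any particular BIOS.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class PowerLimits:
    """Hypothetical container for the three limits discussed in this thread."""
    long_duration_watts: float   # long duration turbo power
    short_duration_watts: float  # short duration turbo power
    max_current_amps: float      # max core/cache current


# Figures quoted in this comment for the 13900K / 14900K.
INTEL_STOCK = PowerLimits(253, 253, 307)


def find_violations(profile, reference=INTEL_STOCK):
    """Return a description of every limit in `profile` that exceeds `reference`."""
    checks = [
        ("long duration turbo power (W)",
         profile.long_duration_watts, reference.long_duration_watts),
        ("short duration turbo power (W)",
         profile.short_duration_watts, reference.short_duration_watts),
        ("max core/cache current (A)",
         profile.max_current_amps, reference.max_current_amps),
    ]
    return [
        f"{name}: {value} exceeds {limit}"
        for name, value, limit in checks
        if value > limit
    ]


if __name__ == "__main__":
    # Assumed illustrative values based on the "4,000+" and "500+" figures above.
    board_default = PowerLimits(4000, 4000, 500)
    for line in find_violations(board_default):
        print(line)
```

You still have to read the values off the BIOS screen yourself; the sketch only makes the comparison explicit.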

Hope this helps someone

JackFrostNCola, (edited )

This is not a typo, right? 307 amps?!
What creative maths have they done to get this number?

The PCB tracks on the motherboard are what, about 0.5mm thick and about 2mm wide (for the larger channels)? I can absolutely guarantee you aren’t getting 300+ amps through those tracks.

Update: Thanks for the replies, it makes sense when dealing with these extremely low voltages and TIL a lot. Cheers!

RedWeasel,

That is 253 watts at roughly 0.82 volts. Divide the watts by the volts and you get around 307 amps. Divide 253 by 307 to get the exact voltage based on those numbers.

JackFrostNCola,

Thanks!

AnyOldName3,

It’s a 250W+ part running at around 1V, so it’s going to draw a lot of current. Power is supplied via many pins on the back of the CPU, and they’re connected to many traces, so it’s not putting all that current through just one. It still puts out a lot of heat anyway, which is why modern motherboards have large heat sinks, sometimes with fans, on their VRMs.
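The arithmetic behind that is just current = power / voltage. A small Python sketch, with hypothetical helper names and illustrative voltages (the 250 W, 253 W and 307 A figures come from this thread; the datasheet lists the power and current limits in separate tables, so the implied voltage is only a rough figure):

```python
def current_amps(power_watts, voltage_volts):
    """Current drawn by a load: I = P / V."""
    if voltage_volts <= 0:
        raise ValueError("voltage must be positive")
    return power_watts / voltage_volts


def implied_voltage(power_watts, amps):
    """Voltage at which a given power and current coincide: V = P / I."""
    if amps <= 0:
        raise ValueError("current must be positive")
    return power_watts / amps


if __name__ == "__main__":
    # A 250 W part at around 1 V, as described above.
    print(current_amps(250, 1.0))
    # Voltage at which the 253 W and 307 A limits would line up.
    print(round(implied_voltage(253, 307), 3))
```

The lower the core voltage, the more current the same wattage needs, which is why a figure in the hundreds of amps is plausible here.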

JackFrostNCola,

Thanks!

TechNerdWizard42,

Oh but you are. It’s in the 0.8 V to 1.2 V range, so it’s high current.

This is what all the VRM design is for. The motherboards are generally 20-30 layers nowadays with 2oz copper in the power layers. The traces are short and you do get hundreds of amps.

And yes, I’ve designed them on the silicon side.

JackFrostNCola,

Thanks!

tal,

I’ve been reading news about this for a bit.

I believe that I may have damaged an i9-13900KF with stock Asus motherboard settings myself (I can still make it work by disabling all but one core, but it now sees constant problems with multiple cores active).

If you’re getting one of these yourself, no joke, give serious consideration to using more-conservative-than-stock-motherboard settings.

paraphrand,

I never choose to mess with overclocking. This situation would have burned someone like me who assumes defaults are safer. What a mess.

tal,

Yeah, I could believe that there would be overclocking settings in a BIOS that would let you damage a CPU. I just was also thinking that whatever motherboard vendors chose as defaults wouldn’t. But, well, I suppose that their own qualification process might not be as rigorous as Intel’s.

Socsa,

In the past it has been considered pretty safe to play with a moderate OC because the CPUs have decent thermal protection built in. Seems like that era might be over.

Audalin,

Any guidance on choosing appropriate conservative settings for an i7-13700K? I may be hit with the same as you in the future (sometimes I have to do some heavy multithreaded combinatorial computations which run for several days at 100°C, using all cores). The motherboard has options for customising pretty much everything there is, but I didn’t touch anything overclocking-related, so I have Asus defaults.

tal,

The article has a bunch of settings that they say that Intel’s flagged as “don’t use”. Intel will be a better source than me.

Audalin,

I see, thanks. Will check. I just thought perhaps you figured out something other than those from your experience.

fatalError,

Did those defaults include XMP though? XMP is also overclocking.

tal,

On my own motherboard, it is a default, but the article doesn’t list it as being a setting believed to be problematic from a CPU damage standpoint.

fatalError,

I guess not for your specific CPU, but Asus fried some Ryzen 7000 CPUs with XMP last year.
