Machine Check Events

Latest revision as of 17:04, 31 January 2024

This article/section is a stub — some half-sorted notes, not necessarily checked, not necessarily correct. Feel free to ignore, or tell me about it.


Machine check exceptions (MCEs) are faults that the processor detects and signals, frequently because of faulty hardware.


Whether a given event is a warning or an error varies.

You'll probably see more warnings, if only because they get logged on a still-running system, while various (fatal) errors hang the system and are at best shown on screen at that moment.


You're probably here because you saw syslog entries like:

[Hardware Error]: Machine check events logged
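To pick such entries out of a larger log, something like the following may help. It is only a sketch: the function name `find_mce_lines` is made up for illustration, and the two patterns are simply the message texts quoted in these notes, so other kernels or versions may word things differently.

```python
import re

# Message fragments taken from the log lines quoted in these notes.
# Assumption: your kernel words these messages the same way.
MCE_PATTERNS = [
    re.compile(r"Machine check events logged"),
    re.compile(r"CMCI storm detected"),
]


def find_mce_lines(lines):
    """Return the lines that contain one of the known MCE-related messages."""
    return [
        line.rstrip("\n")
        for line in lines
        if any(pattern.search(line) for pattern in MCE_PATTERNS)
    ]
```

Feed it any iterable of lines, for example an open syslog file or `sys.stdin` when piping kernel log output into a script.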


For more detail, look at the mcelog package and its logfile, e.g. /var/log/mcelog

These are often just warnings, but often ones you want to know about.

For example, in my case the CPU was being throttled because it was overheating (~90°C).
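If you suspect the same throttling cause, the kernel's own throttle counters are a quick cross-check. The sketch below assumes a Linux system with an Intel CPU, where counters typically appear as `thermal_throttle/core_throttle_count` under each `/sys/devices/system/cpu/cpuN` directory; on other hardware these files may simply not exist. The function name `read_throttle_counts` is hypothetical.

```python
from pathlib import Path

# Assumption: Intel CPUs on Linux expose throttle counters under this
# directory; other platforms may not have these files at all.
CPU_SYSFS = Path("/sys/devices/system/cpu")


def read_throttle_counts(base=CPU_SYSFS):
    """Map each cpuN directory name to its core_throttle_count, where present."""
    counts = {}
    pattern = "cpu[0-9]*/thermal_throttle/core_throttle_count"
    for counter in sorted(Path(base).glob(pattern)):
        cpu_name = counter.parent.parent.name  # e.g. "cpu0"
        try:
            counts[cpu_name] = int(counter.read_text().strip())
        except (OSError, ValueError):
            # Unreadable or malformed counter: skip it rather than guess.
            continue
    return counts
```

An empty result means no counters were found, which says nothing either way about throttling. Non-zero counts suggest that core has been thermally throttled since boot.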


See also:

http://www.mcelog.org/faq.html


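Another kernel message you may run into:

CMCI storm detected: switching to poll mode

CMCI is the Corrected Machine Check Interrupt. A storm means corrected errors are arriving so fast that the kernel stops taking an interrupt for each one and polls for them instead.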

See Linux_admin_notes_-_health_and_statistics#EDAC