This is my attempt at reverse engineering the ECG file format used by the ECG90A electrocardiograph, a device produced by Contec. The device can store an ECG case on the microSD card: during the ECG session, choose the Store option as the print format, then press the Start button.
Here you can find a Python program to draw PDF or PNG electrocardiograms from ECG files: ecg-contec GitHub repository.
Here you can find a review (in Italian) of the device: Elettrocardiografo Contec 90A.
The first 43 bytes of each file are a header that stores some metadata:
Field name | Size | Note |
---|---|---|
Case name | 8 bytes | Example: 0000011 , null terminated. |
Unknown | 2 bytes | Seems to be always two null chars. |
Timestamp | 20 bytes | Example: 2020-11-12 18:09:05 , null terminated. |
Unknown | 2 bytes | Seems to be always two null chars. |
Name | 8 bytes | Name of the patient, null terminated. |
Sex | 1 byte | 0 = F, 1 = M, 255 = Blank |
Age | 1 byte | Cannot enter a number greater than 200. Zero if missing. |
Weight | 1 byte | Zero if missing. |
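The header layout above can be sketched in Python with the standard `struct` module. This is only a minimal illustration, not code taken from the ecg-contec repository: the names `EcgHeader`, `parse_header` and `SEX_CODES` are hypothetical, the text fields are assumed to be ASCII, and a zero age or weight is mapped to `None` following the "Zero if missing" notes.

```python
import struct
from typing import NamedTuple, Optional

# 8s = case name, 2x = unknown, 20s = timestamp, 2x = unknown,
# 8s = patient name, then three single bytes: sex, age, weight.
HEADER_FORMAT = "<8s2x20s2x8sBBB"
HEADER_SIZE = struct.calcsize(HEADER_FORMAT)  # 43 bytes

# Sex codes from the table above; any other code is treated as blank (assumption).
SEX_CODES = {0: "F", 1: "M", 255: None}


class EcgHeader(NamedTuple):
    case_name: str
    timestamp: str
    patient_name: str
    sex: Optional[str]
    age: Optional[int]
    weight: Optional[int]


def _c_string(raw: bytes) -> str:
    """Decode a null-terminated field (ASCII is an assumption)."""
    return raw.split(b"\x00", 1)[0].decode("ascii", errors="replace")


def parse_header(data: bytes) -> EcgHeader:
    """Decode the 43-byte header found at the start of an ECG file."""
    if len(data) < HEADER_SIZE:
        raise ValueError("file is shorter than the 43-byte header")
    case, stamp, name, sex, age, weight = struct.unpack_from(HEADER_FORMAT, data, 0)
    return EcgHeader(
        case_name=_c_string(case),
        timestamp=_c_string(stamp),
        patient_name=_c_string(name),
        sex=SEX_CODES.get(sex),
        age=age or None,        # zero means "missing"
        weight=weight or None,  # zero means "missing"
    )
```

The two unknown fields are skipped with `2x`; if they turn out to carry information, they can be read with `2s` instead.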
Just after the header, the file contains the data payload. The device stores 800 samples per second, i.e. data is acquired at a rate of 800 Hz. Each sample consists of eight (beware: 8 values, not 12!) 16-bit unsigned integers, in little-endian order.
This means that only eight data series are stored. This is a bit surprising, because the device is presented as capable of running the standard 12-lead ECG and it can print all 12 graphs. The following table lists the 12 standard ECG leads; as discussed below, the ones that appear to be stored in the file are II, III and V1 to V6:
Lead | Type | Description |
---|---|---|
I | Bipolar | Limb lead between the right arm and the left arm (red and yellow electrodes). |
II | Bipolar | Limb lead between the right arm and the left leg (red and green electrodes). |
III | Bipolar | Limb lead between the left arm and the left leg (yellow and green electrodes). |
avR | Unipolar | Augmented voltage right, limb lead, red electrode. |
avL | Unipolar | Augmented voltage left, limb lead, yellow electrode. |
avF | Unipolar | Augmented voltage foot, limb lead, green electrode. |
V1 | Unipolar | Precordial lead (chest) V1, white or red. |
V2 | Unipolar | Precordial lead (chest) V2, white or yellow. |
V3 | Unipolar | Precordial lead (chest) V3, white or green. |
V4 | Unipolar | Precordial lead (chest) V4, white or brown. |
V5 | Unipolar | Precordial lead (chest) V5, white or black. |
V6 | Unipolar | Precordial lead (chest) V6, white or violet. |
My guess (to be confirmed) is that, of the six limb leads, only II and III are actually stored, and that the remaining six data series are the precordial leads. This means that the missing values (leads I, avR, avL and avF) are derived mathematically from the stored ones.
This guess is supported by some tests I made: I copied a hand-crafted ECG file onto the SD card of the ECG90A, in which the values of the first two series had been replaced by constant values. When the file is viewed (replayed) on the device, all six limb leads show as straight lines. If only the first series is replaced with a constant value, lead II is a straight line; if only the second series is replaced with a constant, only lead III is a straight line.
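A test file of this kind could be produced with a few lines of Python. The sketch below is only an illustration under stated assumptions: the function name `flatten_series` is hypothetical, the constant 2048 is an arbitrary choice (the value used in the original test is not recorded here), and the file is assumed to consist of the 43-byte header, 16-byte samples and the 37-byte closing block described later in this article.

```python
import struct

HEADER_SIZE = 43        # metadata header described above
TRAILER_SIZE = 37       # closing bytes described later in this article
SERIES_PER_SAMPLE = 8   # eight 16-bit values per sample
SAMPLE_SIZE = 2 * SERIES_PER_SAMPLE


def flatten_series(data: bytes, series: int, value: int = 2048) -> bytes:
    """Return a copy of an ECG file image with one data series set to a constant.

    `series` is the zero-based position of the value inside each sample.
    """
    if not 0 <= series < SERIES_PER_SAMPLE:
        raise ValueError("series must be between 0 and 7")
    out = bytearray(data)
    n_samples = (len(out) - HEADER_SIZE - TRAILER_SIZE) // SAMPLE_SIZE
    for n in range(n_samples):
        offset = HEADER_SIZE + n * SAMPLE_SIZE + 2 * series
        # Overwrite one little-endian unsigned 16-bit value in place.
        struct.pack_into("<H", out, offset, value)
    return bytes(out)
```

For example, `flatten_series(original, 0)` flattens only the first series, and applying it again with `series=1` flattens the second one as well; header and closing bytes are left untouched.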
So, my guess is that each sample is made up of 8 lead values, each of which is a 16-bit unsigned integer in little-endian order, for a total of 16 bytes:
II | III | V1 | V2 | V3 | V4 | V5 | V6 |
---|---|---|---|---|---|---|---|
If the device cannot measure a value (e.g. if an electrode is disconnected), the hex value 0x6800 is stored in the file.
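The sample layout can be decoded as in the following Python sketch. It is a minimal illustration based on the guesses above, not a confirmed specification: the series order is the guessed one, 0x6800 is assumed to be the integer value obtained after little-endian decoding, and the 37 closing bytes described later in this article are skipped. The names `iter_samples`, `SERIES` and `MISSING` are hypothetical.

```python
import struct
from typing import Iterator, Optional, Tuple

HEADER_SIZE = 43
TRAILER_SIZE = 37
SERIES = ("II", "III", "V1", "V2", "V3", "V4", "V5", "V6")  # guessed order
SAMPLE = struct.Struct("<8H")  # eight little-endian unsigned 16-bit integers
MISSING = 0x6800               # marker stored when a value cannot be measured


def iter_samples(data: bytes) -> Iterator[Tuple[Optional[int], ...]]:
    """Yield one tuple of eight raw values per sample; None marks a missing value."""
    end = len(data) - TRAILER_SIZE
    # Stop at the last offset where a whole 16-byte sample still fits.
    for offset in range(HEADER_SIZE, end - SAMPLE.size + 1, SAMPLE.size):
        values = SAMPLE.unpack_from(data, offset)
        yield tuple(None if v == MISSING else v for v in values)
```

Pairing each tuple with the names, e.g. `dict(zip(SERIES, sample))`, gives a per-lead view of a single sample.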
My tests show that the values, under normal conditions, range approximately from a minimum of 1700 to a maximum of 2200. This means that the actual resolution of the analog-to-digital converter is far below the 16 bits allocated for each sample; it can be estimated instead at about 10 bits.
The unit of measure of the values seems to be 0.005 mV. This comes from an empirical measurement: a variation of 200 units in the data produced a variation of 10 mm on the ECG plot at a scale of 10 mm/mV, i.e. 200 units correspond to 1 mV.
It also seems that the values should be shifted by -2048 to obtain a zero-centered graph.
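Putting the scale and the offset together, the conversion from a raw value to millivolts (and to millimetres on paper) can be sketched as follows. Both constants are the empirical estimates given above, not documented values, and the function names are hypothetical.

```python
BASELINE = 2048        # raw value assumed to correspond to 0 mV
MV_PER_UNIT = 0.005    # empirical estimate: 200 units = 1 mV
MM_PER_MV = 10.0       # plot scale used in the measurement above


def raw_to_mv(raw: int) -> float:
    """Convert a raw 16-bit value to millivolts, centered on zero."""
    return (raw - BASELINE) * MV_PER_UNIT


def raw_to_mm(raw: int, mm_per_mv: float = MM_PER_MV) -> float:
    """Convert a raw value to a vertical displacement on the plot, in mm."""
    return raw_to_mv(raw) * mm_per_mv
```

With these constants, a raw value of 2248 is 200 units above the baseline, which corresponds to 1 mV, or 10 mm at the 10 mm/mV scale.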
After the data payload, the file ends with a 37-byte trailer. Generally these bytes are all zero, but in some cases I have found a different 16-bit word at position 26 (counting from 0): 0x0116 or 0x0016. The cases where this word is present do not show any obvious difference from the others.
So it seems that the end of the data can be detected either by looking for an entire row of zeros (16 bytes) or by simply skipping the last 37 bytes of the file.
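Both ways of finding the end of the data can be sketched in Python as below. This is an illustration under the assumptions made so far (43-byte header, 16-byte samples, 37 closing bytes, 800 samples per second); the function names are hypothetical. The zero-row approach relies on the observation that real samples never seem to be all zeros.

```python
HEADER_SIZE = 43
TRAILER_SIZE = 37
SAMPLE_SIZE = 16       # eight 16-bit values
SAMPLE_RATE_HZ = 800


def payload_end_by_trailer(data: bytes) -> int:
    """Offset where the payload ends, obtained by skipping the last 37 bytes."""
    return max(HEADER_SIZE, len(data) - TRAILER_SIZE)


def payload_end_by_zero_row(data: bytes) -> int:
    """Offset of the first 16-byte row made only of zeros after the header."""
    zero_row = bytes(SAMPLE_SIZE)
    for offset in range(HEADER_SIZE, len(data) - SAMPLE_SIZE + 1, SAMPLE_SIZE):
        if data[offset:offset + SAMPLE_SIZE] == zero_row:
            return offset
    return len(data)  # no zero row found: treat the whole rest as payload


def duration_seconds(data: bytes) -> float:
    """Length of the recording, derived from the number of whole samples."""
    n_samples = (payload_end_by_trailer(data) - HEADER_SIZE) // SAMPLE_SIZE
    return n_samples / SAMPLE_RATE_HZ
```

On a well-formed file the two offsets should agree; a mismatch would be a hint that one of the assumptions does not hold for that file.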
Here are the formulas to calculate the values of the leads not included in the data series.
From the voltage formulas of the limb leads, we have:
I = LA - RA
II = LL - RA
III = LL - LA
This should mean that lead I can be calculated by:
I = II - III
The following formulas for the unipolar limb leads are taken from Goldberger’s lead system, well explained in this article: ECG Lead Systems. See also the parsescp manual page. The formulas seem to be confirmed by empirical observation of the graphs of the hand-crafted test file.
avR = (1/2 * III) - II
avL = (1/2 * II) - III
avF = 1/2 * (II + III)
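The four derived leads can be computed from the two stored ones as in this Python sketch. The function name `derive_limb_leads` is hypothetical, and the inputs are assumed to be already zero-centered (2048 subtracted), otherwise the offsets would not cancel out correctly.

```python
from typing import Dict


def derive_limb_leads(ii: float, iii: float) -> Dict[str, float]:
    """Compute leads I, avR, avL and avF from zero-centered leads II and III."""
    return {
        "I": ii - iii,            # (LL - RA) - (LL - LA) = LA - RA
        "avR": iii / 2 - ii,      # same as -(I + II) / 2 with I = II - III
        "avL": ii / 2 - iii,      # same as I - II / 2 with I = II - III
        "avF": (ii + iii) / 2,    # same as II - I / 2 with I = II - III
    }
```

The comments show how each expression follows from the usual Goldberger definitions once lead I is replaced by II - III; the same function works on raw units or on millivolts, as long as the values are zero-centered.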
0000037.ECG - This is a file downloaded from the Contec ECG90A. It is a 10-second recording in which only the limb leads were attached, so only the first two data series (lead II and lead III) contain valid numbers. Besides the original file, the archive contains a file called 0000037.ECG.csv with the same data converted to CSV format. The CSV contains the data of all six limb leads, four of which are calculated with the formulas above. Finally, the archive contains a file called 0000037.ECG.edf, with the CSV data converted using EDFbrowser (with the included ecg90a.template), to be viewed in EDFbrowser itself.
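A conversion of this kind could look like the Python sketch below. It is not the script used to produce the CSV in the archive: the column names, the time column, the millivolt units and the number formatting are all assumptions made for illustration, and the function name `limb_leads_csv` is hypothetical. Missing values (0x6800) are not handled specially here.

```python
import csv
import io
import struct

HEADER_SIZE, TRAILER_SIZE = 43, 37
SAMPLE = struct.Struct("<8H")        # eight little-endian unsigned 16-bit values
SAMPLE_RATE_HZ = 800
BASELINE, MV_PER_UNIT = 2048, 0.005  # empirical estimates from this article


def limb_leads_csv(data: bytes) -> str:
    """Return CSV text with the six limb leads (in mV) for every sample."""
    out = io.StringIO()
    writer = csv.writer(out, lineterminator="\n")
    writer.writerow(["time_s", "I", "II", "III", "avR", "avL", "avF"])
    end = len(data) - TRAILER_SIZE
    offsets = range(HEADER_SIZE, end - SAMPLE.size + 1, SAMPLE.size)
    for n, offset in enumerate(offsets):
        # Only the first two series (leads II and III) are needed here.
        raw_ii, raw_iii = SAMPLE.unpack_from(data, offset)[:2]
        ii = (raw_ii - BASELINE) * MV_PER_UNIT
        iii = (raw_iii - BASELINE) * MV_PER_UNIT
        leads = (ii - iii, ii, iii, iii / 2 - ii, ii / 2 - iii, (ii + iii) / 2)
        writer.writerow([f"{n / SAMPLE_RATE_HZ:.5f}"] + [f"{v:.3f}" for v in leads])
    return out.getvalue()
```

Reading the file with `open(path, "rb").read()` and writing the returned text to disk gives a CSV that can be opened in a spreadsheet or imported elsewhere.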
This proprietary ECG format is explained above: it is used by the Contec ECG90A and maybe by other devices from the same manufacturer.
The European Data Format (EDF) is a simple and flexible format for exchange and storage of multichannel biological and physical signals. It is an open format; here you can find an explained example. The open source program EDFbrowser is a viewer for that format. A limitation of this format is that it is not specifically targeted to ECG, so the existing viewers generally do not have the specific functions required to view an ECG.
The SCP-ECG (Standard Communications Protocol for Computer-Assisted Electrocardiography) is the European standard for communication of resting ECGs. SCP-ECG (as per the accepted ISO standard 11073-91064:2009) is widely accepted, even if limited to short-term resting ECG. Free documentation about the format is scarce; even the historical ANSI/AAMI EC71:2001 is available only at an astronomical price.
The format is rather complicated: it can contain raw or Huffman-compressed data; it allows custom and multiple Huffman tables; data sequences can be actual values or first or second differences; it can store QRS and rhythm data; rhythm data can be stored as plain values or as reference beat subtraction; bimodal compression can be used too. Despite all these complications, only 65535 bytes can be stored for each data set, which means that this format is not suitable for recordings lasting more than thirty seconds!
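The order of magnitude of that limit can be checked with a little arithmetic. The sketch below assumes uncompressed 16-bit samples and reads "data set" as the samples of a single lead; both are assumptions made for illustration, and the sample rates are just examples.

```python
MAX_DATA_BYTES = 65535  # size limit per data set mentioned above


def max_recording_seconds(sample_rate_hz: int, bytes_per_sample: int = 2) -> float:
    """Longest uncompressed recording that fits into one data set."""
    max_samples = MAX_DATA_BYTES // bytes_per_sample
    return max_samples / sample_rate_hz


for rate in (500, 1000):
    print(rate, "Hz:", round(max_recording_seconds(rate), 1), "s")
```

At 1000 Hz this gives about 32.8 seconds, which is consistent with the thirty-second figure; lower sample rates or compression would stretch the limit somewhat.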
Starting with version V3.0 (year 2014), the standard also provides support for the storage of continuous, long-term ECG recordings.
The HL7 aECG (HL7 Annotated Electrocardiogram) is a standard created in response to the Food and Drug Administration’s digital electrocardiogram initiative. It was accepted by ANSI in May 2004.
Digital Imaging and Communications in Medicine (DICOM) - Since the year 2000 the widely used DICOM standard has included rules for diagnostic ECG waveforms, but for a long time no ECG manufacturer marketed electrocardiographs supporting the DICOM waveform standard. It was spring 2006 before the first ECG manufacturer announced its adoption of the DICOM standard for diagnostic electrocardiographs [4]. The only drawback of DICOM is the complexity of the standard, which requires a developer to have prior knowledge of the DICOM philosophy [2].
To confirm the insane complexity of the standard, it is sufficient to say that it supports three different types of ECG: 12-Lead ECG for short-term measurements (1 to 13 channels, 200 to 1000 Hz, max 16384 samples), General ECG (1 to 24 channels, 200 to 1000 Hz) and Ambulatory ECG for long-term measurements, e.g. Holter (1 to 12 channels, 50 to 1000 Hz).