130 lines
7.5 KiB
Plaintext
https://elixir.bootlin.com/linux/latest/source/kernel/gcov/gcc_4_7.c
https://github.com/gcc-mirror/gcc/blob/master/gcc/gcov-io.c
https://github.com/gcc-mirror/gcc/blob/master/gcc/gcov-io.h
https://stackoverflow.com/questions/36839354/generating-gcda-coverage-files-via-qemu-gdb
http://blog.techveda.org/howsourcedebuggerswork/

Coverage information is held in two files: a notes file, which is
generated by the compiler, and a data file, which is generated by
the program under test. Both files use a similar structure. We do
not attempt to make these files backwards compatible with previous
versions, as you only need coverage information when developing a
program. We do hold version information, so that mismatches can be
detected, and we use a format that allows tools to skip information
they do not understand or are not interested in.
Numbers are recorded in the 32 bit unsigned binary form of the
endianness of the machine generating the file. 64 bit numbers are
stored as two 32 bit numbers, the low part first. Strings are
padded with 1 to 4 NUL bytes, to bring the length up to a multiple
of 4. The number of 4-byte words is stored, followed by the padded
string. Zero length and NULL strings are simply stored as a length
of zero (they have no trailing NUL or padding).
   int32:   byte3 byte2 byte1 byte0 | byte0 byte1 byte2 byte3
   int64:   int32:low int32:high
   string:  int32:0 | int32:length char* char:0 padding
   padding: | char:0 | char:0 char:0 | char:0 char:0 char:0
   item:    int32 | int64 | string
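The encoding rules above can be sketched in a few lines of Python.
This is only an illustration of the text, not code from gcc: the
function names are made up for this note, and the endianness is passed
in as a struct prefix ("<" or ">") on the assumption that the caller
already knows it.

```python
import struct

def read_int32(buf, offset, endian="<"):
    """Return (value, next_offset) for one unsigned 32 bit number."""
    (value,) = struct.unpack_from(endian + "I", buf, offset)
    return value, offset + 4

def read_int64(buf, offset, endian="<"):
    """A 64 bit number is two int32 values, the low part first."""
    low, offset = read_int32(buf, offset, endian)
    high, offset = read_int32(buf, offset, endian)
    return (high << 32) | low, offset

def write_string(text, endian="<"):
    """Encode a string as: word count, bytes, then 1 to 4 NUL bytes."""
    if not text:
        # Zero length and NULL strings are stored as a length of zero only.
        return struct.pack(endian + "I", 0)
    data = text.encode("utf-8")
    padded = data + b"\0" * (4 - len(data) % 4)
    return struct.pack(endian + "I", len(padded) // 4) + padded

def read_string(buf, offset, endian="<"):
    """Return (text or None, next_offset); the length counts 4-byte words."""
    words, offset = read_int32(buf, offset, endian)
    if words == 0:
        return None, offset
    raw = buf[offset:offset + 4 * words]
    return raw.rstrip(b"\0").decode("utf-8"), offset + 4 * words
```

Note that a string whose length is already a multiple of 4 still gets
four NUL bytes, so "main" occupies 4 + 8 bytes in this sketch.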
The basic format of the notes file is
   file : int32:magic int32:version int32:stamp int32:support_unexecuted_blocks record*
The basic format of the data file is
   file : int32:magic int32:version int32:stamp record*
The magic ident is different for the notes and the data files. The
magic ident is used to determine the endianness of the file, when
reading. The version is the same for both files and is derived
from gcc's version number. The stamp value is used to synchronize
note and data files and to synchronize merging within a data
file. It need not be an absolute time stamp, merely a ticker that
increments fast enough and cycles slow enough to distinguish
different compile/run/compile cycles.
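A possible way to use the magic ident for endianness detection is to
try both byte orders and keep the one that yields a known magic. The
sketch below assumes the magic values defined in gcov-io.h ("gcno" and
"gcda" read as 32 bit numbers); parse_header is a hypothetical helper
name, and the extra support_unexecuted_blocks word of the notes file is
left to the caller.

```python
import struct

GCOV_NOTE_MAGIC = 0x67636E6F  # "gcno", value assumed from gcov-io.h
GCOV_DATA_MAGIC = 0x67636461  # "gcda", value assumed from gcov-io.h

def parse_header(buf):
    """Read magic, version and stamp, guessing endianness from the magic."""
    for endian in ("<", ">"):
        magic, version, stamp = struct.unpack_from(endian + "III", buf, 0)
        if magic in (GCOV_NOTE_MAGIC, GCOV_DATA_MAGIC):
            kind = "notes" if magic == GCOV_NOTE_MAGIC else "data"
            return {"kind": kind, "endian": endian,
                    "version": version, "stamp": stamp}
    raise ValueError("not a gcov notes or data file")
```

A notes file and a data file belong together only if their version and
stamp fields agree, so a tool would typically compare the two
dictionaries returned here before going further.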
Although the ident and version are formally 32 bit numbers, they
are derived from 4 character ASCII strings. The version number
consists of a two character major version number (the first
character starts from the letter 'A' so as not to clash with the
older numbering scheme), a single character minor version number,
and a single character indicating the status of the release:
'e' for experimental, 'p' for prerelease and 'r' for release.
Because, by good fortune, these are in alphabetical order, string
collating can be used to compare version strings. Be aware that
the 'e' designation will (naturally) be unstable and might be
incompatible with itself. For gcc 17.0 experimental, it would be
'B70e' (0x42373065). As we currently do not release more than 5 minor
releases, a single character should always be enough. The major
number currently changes roughly every year, which leaves room
for the next 250 years (the maximum allowed number would be 259.9).
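The version scheme can be checked against the 'B70e' example with a
small sketch. It follows only the description above (tens of the major
number mapped to a letter starting at 'A', then one digit each for the
units and the minor number); the function names are invented here and
the older numbering scheme is not handled.

```python
def encode_version(major, minor, status):
    """Build the 32 bit version word from major, minor and status."""
    if not (0 <= major <= 259 and 0 <= minor <= 9 and status in "epr"):
        raise ValueError("version out of range for this scheme")
    text = (chr(ord("A") + major // 10) + chr(ord("0") + major % 10)
            + chr(ord("0") + minor) + status)
    return int.from_bytes(text.encode("ascii"), "big")

def decode_version(value):
    """Split a version word back into (major, minor, status)."""
    text = value.to_bytes(4, "big").decode("ascii")
    major = (ord(text[0]) - ord("A")) * 10 + int(text[1])
    return major, int(text[2]), text[3]
```

With these definitions encode_version(17, 0, "e") gives 0x42373065,
matching the example in the text, and the 4 character strings compare
in release order ('B70e' sorts before 'B70p', which sorts before
'B70r').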
A record has a tag, a length and a variable amount of data.
   record: header data
   header: int32:tag int32:length
   data:   item*
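Because every record carries its own length, a reader can step over
records it does not understand. The generator below is a minimal
sketch of that idea; walk_records is a hypothetical name, the length is
taken to count 4-byte words, and special cases that real files may
contain are not handled.

```python
import struct

def walk_records(buf, offset, endian="<"):
    """Yield (tag, payload_bytes) for each record found from offset on."""
    while offset + 8 <= len(buf):
        tag, length = struct.unpack_from(endian + "II", buf, offset)
        start = offset + 8
        end = start + 4 * length      # length is in 4-byte words
        if end > len(buf):
            raise ValueError("truncated record")
        yield tag, buf[start:end]
        offset = end
```

A caller would pass the offset just past the file header and dispatch
on the tag, ignoring payloads for tags it does not know.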
Records are not nested, but there is a record hierarchy. Tag
numbers reflect this hierarchy. Tags are unique across note and
data files. Some record types have a varying amount of data. The
LENGTH is the number of 4-byte words that follow and is usually
used to determine how much data there is. The tag value is split
into four 8-bit fields, one for each of four possible levels. The
most significant is allocated first. Unused levels are zero. Active
levels are odd-valued, so that the LSB of the level is one. A
sub-level incorporates the values of its superlevels. This
formatting allows you to determine the tag hierarchy, without
understanding the tags themselves, and is similar to the standard
section numbering used in technical documents. Level values
[1..3f] are used for common tags, values [41..9f] for the notes
file and [a1..ff] for the data file.
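The tag layout can be explored without knowing what any tag means. The
helpers below are a sketch with invented names; classifying a tag by
its deepest active level is my reading of the level ranges above, not
something the text states outright.

```python
def tag_levels(tag):
    """Return the active level values, most significant field first."""
    levels = []
    for shift in (24, 16, 8, 0):
        field = (tag >> shift) & 0xFF
        if field == 0:          # unused levels are zero
            break
        levels.append(field)
    return levels

def is_subtag(parent, child):
    """True if child sits exactly one level below parent."""
    p, c = tag_levels(parent), tag_levels(child)
    return len(c) == len(p) + 1 and c[:len(p)] == p

def tag_scope(tag):
    """Classify a tag as common, notes or data by its deepest level."""
    levels = tag_levels(tag)
    if not levels:
        raise ValueError("tag has no active level")
    last = levels[-1]
    if 0x01 <= last <= 0x3F:
        return "common"
    if 0x41 <= last <= 0x9F:
        return "notes"
    if 0xA1 <= last <= 0xFF:
        return "data"
    raise ValueError("level value outside the documented ranges")
```

For example, with the tag values 0x01000000 and 0x01430000 (which
gcov-io.h uses for the function and arcs records), tag_levels gives
[0x01] and [0x01, 0x43], so the second is a subtag of the first and
belongs to the notes file.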
The notes file contains the following records
   note: unit function-graph*
   unit: header int32:checksum string:source
   function-graph: announce_function basic_blocks {arcs | lines}*
   announce_function: header int32:ident
        int32:lineno_checksum int32:cfg_checksum
        string:name string:source int32:start_lineno int32:start_column int32:end_lineno
   basic_block: header int32:flags*
   arcs: header int32:block_no arc*
   arc:  int32:dest_block int32:flags
   lines: header int32:block_no line*
        int32:0 string:NULL
   line:  int32:line_no | int32:0 string:filename
The BASIC_BLOCK record holds per-bb flags. The number of blocks
can be inferred from its data length. There is one ARCS record per
basic block. The number of arcs from a bb is implicit from the
data length. It enumerates the destination bb and per-arc flags.
There is one LINES record per basic block; it enumerates the source
lines which belong to that basic block. Source file names are
introduced by a line number of 0; the lines that follow are from
the new source file. The initial source file for the function is
NULL, but the current source file should be remembered from one
LINES record to the next. The end of a block is indicated by an
empty filename; this does not reset the current source file. Note
there is no ordering of the ARCS and LINES records: they may be in
any order, interleaved in any manner. The current filename follows
the order the LINES records are stored in the file, *not* the
ordering of the blocks they are for.
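The ARCS and LINES payloads described above could be decoded roughly as
follows. This is a sketch under the stated rules only: the function
names are hypothetical, the payload is the record data without its
header, and the current source file is threaded through by the caller
so that it survives from one LINES record to the next.

```python
import struct

def parse_arcs_record(payload, endian="<"):
    """Return (block_no, [(dest_block, flags), ...])."""
    words = struct.unpack(endian + "%dI" % (len(payload) // 4), payload)
    # The arc count is implicit: two words per arc after block_no.
    arcs = [(words[i], words[i + 1]) for i in range(1, len(words) - 1, 2)]
    return words[0], arcs

def parse_lines_record(payload, current_file=None, endian="<"):
    """Return (block_no, [(file, line_no), ...], current_file)."""
    words = struct.unpack(endian + "%dI" % (len(payload) // 4), payload)
    block_no, i, lines = words[0], 1, []
    while i < len(words):
        line_no = words[i]
        i += 1
        if line_no != 0:
            lines.append((current_file, line_no))
            continue
        if i >= len(words):
            break
        nwords = words[i]             # string length in 4-byte words
        i += 1
        if nwords == 0:               # empty filename: end of the block
            break
        raw = payload[4 * i:4 * (i + nwords)]
        current_file = raw.rstrip(b"\0").decode("utf-8")
        i += nwords
    return block_no, lines, current_file
```

The third return value should be fed back in as current_file when
parsing the next LINES record in file order, regardless of which block
that record is for.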
The data file contains the following records.
   data: {unit summary:object summary:program* function-data*}*
   unit: header int32:checksum
   function-data: announce_function present counts
   announce_function: header int32:ident
        int32:lineno_checksum int32:cfg_checksum
   present: header int32:present
   counts: header int64:count*
   summary: int32:checksum {count-summary}GCOV_COUNTERS_SUMMABLE
   count-summary: int32:num int32:runs int64:sum
        int64:max int64:sum_max histogram
   histogram: {int32:bitvector}8 histogram-buckets*
   histogram-buckets: int32:num int64:min int64:sum
The ANNOUNCE_FUNCTION record is the same as that in the note file,
but without the source location. The COUNTS gives the
counter values for instrumented features. The SUMMARY records give
information about the whole object file and about the whole
program. The checksum is used for whole program summaries, and
disambiguates different programs which include the same
instrumented object file. There may be several program summaries,
each with a unique checksum. The object summary's checksum is
zero. Note that the data file might contain information from
several runs concatenated, or the data might be merged.
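For the data file, the two payloads a coverage tool most often needs
are the function announcement and the counters. The sketch below
decodes both from raw record data; the helper names and the dictionary
keys are illustrative choices for this note, following the grammar
above.

```python
import struct

def parse_data_announce_function(payload, endian="<"):
    """Decode ident and the two checksums of a data-file announcement."""
    ident, lineno_checksum, cfg_checksum = struct.unpack_from(
        endian + "III", payload, 0)
    return {"ident": ident,
            "lineno_checksum": lineno_checksum,
            "cfg_checksum": cfg_checksum}

def parse_counts(payload, endian="<"):
    """Decode a COUNTS payload into a list of 64 bit counter values."""
    words = struct.unpack(endian + "%dI" % (len(payload) // 4), payload)
    # Each counter is two 32 bit words, the low part first.
    return [(words[i + 1] << 32) | words[i]
            for i in range(0, len(words) - 1, 2)]
```

A tool could then match each function in the data file to the notes
file by comparing ident and both checksums before attaching counters
to that function's graph.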
This file (gcov-io.h) is included by the compiler, the gcov tools
and the runtime support library libgcov. IN_LIBGCOV and IN_GCOV are
used to distinguish which case is which. If IN_LIBGCOV is nonzero,
libgcov is being built. If IN_GCOV is nonzero, the gcov tools are
being built. Otherwise the compiler is being built. IN_GCOV may be
positive or negative. If positive, we are compiling a tool that
requires additional functions (see the code for what those
functions are).