6.12. Half-Precision Floating Point
On ARM targets, GCC supports half-precision (16-bit) floating point via the __fp16 type. You must enable this type explicitly with the -mfp16-format command-line option in order to use it.
ARM supports two incompatible representations for half-precision floating-point values. You must choose one of the representations and use it consistently in your program.
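As a rough illustration of keeping the choice consistent, the sketch below uses __fp16 only when the compiler reports that one of the two formats is enabled, and otherwise falls back to float so the file still builds on other targets. The names half_storage_t, HAVE_FP16_STORAGE, and scaled_sum are hypothetical and chosen for this example. The check assumes the ACLE predefined macros __ARM_FP16_FORMAT_IEEE and __ARM_FP16_FORMAT_ALTERNATIVE are available in your toolchain.

```c
#include <stddef.h>

/* Assumption: the compiler defines one of these ACLE macros when a
   half-precision format has been selected (e.g. via -mfp16-format). */
#if defined(__ARM_FP16_FORMAT_IEEE) || defined(__ARM_FP16_FORMAT_ALTERNATIVE)
typedef __fp16 half_storage_t;   /* 16-bit storage type */
#define HAVE_FP16_STORAGE 1
#else
typedef float half_storage_t;    /* fallback so the sketch compiles anywhere */
#define HAVE_FP16_STORAGE 0
#endif

/* Hypothetical helper: sum n stored values and scale the result.
   Each element is converted to float before any arithmetic is done. */
static float scaled_sum(const half_storage_t *values, size_t n, float scale)
{
    float total = 0.0f;
    for (size_t i = 0; i < n; i++) {
        total += (float)values[i];
    }
    return total * scale;
}
```

On an ARM target this might be built with something like `gcc -mfp16-format=ieee file.c`; every translation unit that shares __fp16 data should then be compiled with the same setting.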
Specifying -mfp16-format=ieee selects the IEEE 754-2008 format. This format can represent normalized values in the range of 2^-14 to 65504. There are 11 bits of significand precision, approximately 3 decimal digits.
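To make the quoted range and precision concrete, here is a minimal portable sketch that decodes an IEEE 754-2008 half-precision bit pattern (1 sign bit, 5 exponent bits, 10 stored fraction bits) into a float. It does not use __fp16 and so runs on any target. The helper names scale_pow2 and half_bits_to_float are hypothetical, not part of GCC.

```c
#include <math.h>    /* INFINITY, NAN */
#include <stdint.h>

/* Hypothetical helper: multiply x by 2^e using exact float operations. */
static float scale_pow2(float x, int e)
{
    while (e > 0) { x *= 2.0f; e--; }
    while (e < 0) { x *= 0.5f; e++; }
    return x;
}

/* Hypothetical helper: decode an IEEE half-precision bit pattern. */
static float half_bits_to_float(uint16_t bits)
{
    int sign     = (bits >> 15) & 0x1;
    int exponent = (bits >> 10) & 0x1F;
    int fraction = bits & 0x3FF;
    float value;

    if (exponent == 0) {
        /* Zero or subnormal: no implicit leading 1, scale is 2^-24. */
        value = scale_pow2((float)fraction, -24);
    } else if (exponent == 0x1F) {
        /* All-ones exponent: infinity or NaN in the IEEE format. */
        value = fraction ? NAN : INFINITY;
    } else {
        /* Normalized: the implicit leading 1 plus 10 stored bits gives
           11 bits of significand; the exponent bias is 15. */
        value = scale_pow2((float)(fraction | 0x400), exponent - 25);
    }
    return sign ? -value : value;
}
```

With this decoder, the pattern 0x0400 yields the smallest normalized value, 2^-14, and 0x7BFF yields the largest finite value, 65504.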
Specifying -mfp16-format=alternative selects the ARM alternative format. This representation is similar to the IEEE format, but does not support infinities or NaNs. Instead, the range of exponents