Analysis of information sources in the references of the English-language version of the Wikipedia article "Bfloat16 floating-point format".
The bfloat16 format is a compact way of representing numbers that offers the dynamic range of a full 32-bit float in the data size of a 16-bit number: it keeps the 8 exponent bits of IEEE 754 single precision but truncates the significand from 24 bits to 8, so the absolute spacing between representable values stays fine near zero and grows coarser toward the limits of the range. The format is widely used in machine learning algorithms, where its float32-sized range avoids the overflow problems of IEEE half precision, while the halved storage fits twice as much data into a given memory or bandwidth budget (or roughly doubles throughput in those sections of the computation).
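At the bit level, a bfloat16 value is simply the upper 16 bits of the corresponding IEEE 754 float32: the sign bit and all 8 exponent bits survive, while the stored significand shrinks from 23 bits to 7. A minimal sketch in plain Python (the helper names are ours, for illustration only):

```python
import struct

def float32_to_bfloat16_truncate(x: float) -> int:
    """Return the bfloat16 bit pattern obtained by truncating a float32."""
    bits = struct.unpack("<I", struct.pack("<f", x))[0]  # raw float32 bits
    return bits >> 16  # keep sign + 8 exponent bits + top 7 significand bits

def bfloat16_to_float32(b: int) -> float:
    """Widen a bfloat16 bit pattern back to float32 (exact, no rounding)."""
    return struct.unpack("<f", struct.pack("<I", b << 16))[0]

print(bfloat16_to_float32(float32_to_bfloat16_truncate(3.14159265)))  # 3.140625: ~3 decimal digits
print(bfloat16_to_float32(float32_to_bfloat16_truncate(1.0e38)))      # large float32 values still representable
```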
Uses the non-IEEE Round-to-Odd rounding mode.
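Round-to-odd is not one of the IEEE 754 rounding modes: whenever a conversion is inexact, the lowest retained bit is forced to 1 ("jamming"), which avoids double-rounding errors if the value is rounded again at a later stage. A minimal sketch of the idea applied to float32-to-bfloat16 conversion (hypothetical helper, not taken from the cited source):

```python
import struct

def float32_to_bfloat16_round_to_odd(x: float) -> int:
    bits = struct.unpack("<I", struct.pack("<f", x))[0]
    upper = bits >> 16
    if bits & 0xFFFF:  # any discarded bit set -> result is inexact
        upper |= 1     # round to odd: force the sticky low bit to 1
    return upper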
All operations in TensorFlow Distributions are numerically stable across half, single, and double floating-point precisions (as TensorFlow dtypes: tf.bfloat16 (truncated floating point), tf.float16, tf.float32, tf.float64). Class constructors have a validate_args flag for numerical asserts.
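A minimal sketch of both behaviours the quote describes, assuming the current tensorflow_probability packaging of TensorFlow Distributions (the distribution choice and parameters are our own illustration):

```python
import tensorflow as tf
import tensorflow_probability as tfp

tfd = tfp.distributions

for dtype in (tf.bfloat16, tf.float16, tf.float32, tf.float64):
    dist = tfd.Normal(loc=tf.constant(0.0, dtype),
                      scale=tf.constant(1.0, dtype),
                      validate_args=True)  # adds runtime asserts, e.g. scale > 0
    print(dtype.name, dist.log_prob(tf.constant(0.5, dtype)).numpy())
```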
This custom floating point format is called "Brain Floating Point Format," or "bfloat16" for short. The name flows from "Google Brain", which is an artificial intelligence research group at Google where the idea for this format was conceived.
This page lists the TensorFlow Python APIs and graph operators available on Cloud TPU.
On TPU, the rounding scheme in the conversion is round to nearest even and overflow to inf.
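The usual implementation of this scheme adds a rounding bias of 0x7FFF plus the lowest kept bit before shifting, so ties round to the even result; overflow to inf then falls out of the carry propagating into the exponent field. A minimal sketch in plain Python (the helper name is ours; NaN inputs are glossed over):

```python
import struct

def float32_to_bfloat16_rne(x: float) -> int:
    bits = struct.unpack("<I", struct.pack("<f", x))[0]
    lsb = (bits >> 16) & 1      # lowest bit that will be kept
    bias = 0x7FFF + lsb         # ties round toward the even (lsb = 0) pattern
    return (bits + bias) >> 16  # near float32 max, the carry yields inf (0x7F80)
```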
Google invented its own internal floating point format called "bfloat" for "brain floating point" (after Google Brain).
Converts float number to nv_bfloat16 precision in round-to-nearest-even mode and returns nv_bfloat16 with converted value.
For the Cloud TPU, Google recommended we use the bfloat16 implementation from the official TPU repository with TensorFlow 1.7.0. Both the TPU and GPU implementations make use of mixed-precision computation on the respective architecture and store most tensors with half-precision.
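A minimal sketch of the same idea with today's API, assuming the TF 2.x Keras mixed-precision interface rather than the TensorFlow 1.7 TPU repository the quote refers to:

```python
import tensorflow as tf

# bfloat16 compute with float32 variables: most tensors are stored in half precision.
tf.keras.mixed_precision.set_global_policy("mixed_bfloat16")

model = tf.keras.Sequential([
    tf.keras.layers.Dense(128, activation="relu"),
    tf.keras.layers.Dense(10),
    # Keep the final activation in float32 for numerical stability.
    tf.keras.layers.Activation("softmax", dtype="float32"),
])
model.compile(optimizer="adam", loss="sparse_categorical_crossentropy")
```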
Intel said that the NNP-L1000 would also support bfloat16, a numerical format that's being adopted by all the ML industry players for neural networks. The company will also support bfloat16 in its FPGAs, Xeons, and other ML products. The Nervana NNP-L1000 is scheduled for release in 2019.
Intel plans to support this format across all their AI products, including the Xeon and FPGA lines
...Intel will be extending bfloat16 support across our AI product lines, including Intel Xeon processors and Intel FPGAs.
In many models this is a drop-in replacement for float32.