chipKIT® Development Platform

Inspired by Arduino™

Double Vs. Float with chipKIT IDE

Created Sun, 17 Jul 2011 14:04:22 +0000 by Gabriel


Sun, 17 Jul 2011 14:04:22 +0000

Hi everybody,

To quote the Arduino site, "Unlike other platforms, where you can get more precision by using a double (e.g. up to 15 digits), on the Arduino, double is the same size as float. "

Does this fact carry over to the chipKIT IDE?

I have a chipKIT Uno32, and so far my attempts to utilize double precision haven't yielded results that are any different than when using floating point numbers.

Here is the code I'm using if it helps. It calculates the Julian Day and is the first step in a fairly lengthy algorithm which calculates the current position of the sun.

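double year, month, day, timezone, hour, minute;   // day, hour and minute are assumed to be set elsewhere
double m, d, a, b, jd;
year = 2011;
timezone = -5;
month = 1;
m = month;
// Day of month plus the fraction of the day, shifted from local time to UT.
d = day + ((hour + (-1 * timezone)) / 24.0) + minute / 1440.0;
// Meeus: January and February count as months 13 and 14 of the previous year.
if (m <= 2) {
  year = year - 1.0;
  m = m + 12.0;
}
// The long casts deliberately drop the fractional part.
a = double(long(year / 100.0));
b = 2.0 - a + double(long(a / 4.0));
jd = double(long(365.25 * (year + 4716.0))) + double(long(30.6001 * (m + 1.0))) + d + b + -1524.5;
//Serial.println("JD Output");
//Serial.println(double(long(365.25 * (year+4716.0))),7);
//Serial.println(double(long(30.6001 * (month+1.0))),7);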

The same program run in MATLAB on my laptop outputs 2455195.7506944.
The above code gives 2455195.7500000 as the answer regardless of whether I use floats or doubles, on either a regular Arduino or the chipKIT Uno32.

It doesn't look like using double makes any difference here. Am I implementing it incorrectly, or is it just that the IDE is not set up for it?

Any advice would be much appreciated.

Thank you so much for your time, Gabriel

P.S. Before anyone asks, yes, that 0.0006944, believe it or not, does make a fairly significant difference in the final altitude and azimuth output of the sun.
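For reference, here is a minimal sketch of the Julian Day step as a self-contained function, following the Meeus formula that the code above is based on. The name julianDay and its parameter order are illustrative and not from the original program. It assumes a compiler where double is a true 64-bit type, and it applies the Gregorian correction term b unconditionally, as the code above does.

```cpp
#include <cmath>

// Julian Day for a Gregorian calendar date and local clock time.
// timezone is the offset from UT in hours (e.g. -5), as in the post above.
double julianDay(double year, double month, double day,
                 double hour, double minute, double timezone) {
    // Day of month plus fraction of day, shifted from local time to UT.
    double d = day + (hour - timezone) / 24.0 + minute / 1440.0;

    // January and February are treated as months 13 and 14 of the previous year.
    if (month <= 2.0) {
        year -= 1.0;
        month += 12.0;
    }

    double a = std::floor(year / 100.0);
    double b = 2.0 - a + std::floor(a / 4.0);   // Gregorian calendar correction

    return std::floor(365.25 * (year + 4716.0))
         + std::floor(30.6001 * (month + 1.0))
         + d + b - 1524.5;
}
```

With 64-bit doubles, julianDay(2000, 1, 1, 12, 0, 0) returns 2451545.0, the standard J2000.0 epoch, which makes a convenient sanity check.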


Sun, 17 Jul 2011 15:59:00 +0000

I don't know the algorithm so this might be an invalid question, but are you sure all those "long" casts belong there?

For example, in the line:

a = double(long(year/100.0));

If the year is 2011 the value of 'a' will be something like 20, whereas if you take out the long it will be 20.11.

Your use of the type cast will convert the floating point representation of year/100.0 to a long integer, then convert that to a double. You'll have lost the fractional part in the process.

Is that what you meant?


Sun, 17 Jul 2011 16:53:45 +0000

Ah yes, I should have mentioned it, but the long casts are indeed intentional. I've never tried to actually understand how the algorithm works, I just copied it from the book Astronomical Algorithms by Jean Meeus. I've had lots of success with the code before, it just doesn't want to work on the chipKIT quite like I had hoped.

If I actually print and compare the output of just parts of the final jd calculation, they do match the output of my computer program exactly. You can see the commented out code towards the bottom where I checked it.

That last line though (jd = ...) where all the individual parts combine results in a number which seems to just be too big for the chipKIT.

I have put together other sun tracking programs for the Arduino which do work OKish, but they use tables to help with the heavy lifting and aren't really as accurate as I would like. The calculated altitude and azimuth are barely within a degree during certain times of the year.

If you Google "Open Source Arduino Sun Tracking / Heliostat Program", you will find what I'm working on. I'd link it, but I think I'm still too new here for that.

Thanks for your reply! Gabriel


Sun, 17 Jul 2011 18:11:36 +0000

If the purpose of those long casts is to truncate the fractional part, may I recommend using floor() function? The way it is now, it looks really hackish.
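A small sketch of the difference between the two approaches, with illustrative helper names that are not from the thread. For the non-negative values in the Julian Day code the two agree. They differ for negative inputs, and the cast only works while the value fits in a long.

```cpp
#include <cmath>

// Truncation via an integer cast: rounds toward zero and only works
// while the value fits in a long.
double truncateByCast(double x) { return double(long(x)); }

// std::floor: rounds toward negative infinity and stays in floating point.
double truncateByFloor(double x) { return std::floor(x); }
```

For example, both return 20.0 for 20.11, but for -0.5 the cast gives 0.0 while floor gives -1.0.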


Mon, 18 Jul 2011 04:29:11 +0000

** deleted by author **

I'd incorrectly assumed that all fractional parts were stripped away by the (long) casts. In fact, variable 'd' still contains a fraction.

Sorry if you read this between its posting and subsequent correction.

The fractional part of 'd' is .25069444, which when combined with the fractional .5 in the calculation of 'jd' should indeed result in the required fraction of .75069444.

Now I'm puzzled.


Mon, 18 Jul 2011 12:44:52 +0000

In 32-bit systems, sizeof(double) == 8, whereas sizeof(long) == 4. What happens when we strip a good 4 bytes off a double value?


Mon, 18 Jul 2011 12:49:23 +0000

Hmmmm ... where in the code might that take effect?


Mon, 18 Jul 2011 13:41:04 +0000

Now that I think of it, nowhere, my bad.

The result of double(long(2011/100.0)), same as (double)((long)(2011/100.0)) is 20.0. There's no fractional part.

Does this algorithm produce the correct result on a desktop system? If you don't have a C++ compiler set up, you can run a simple test with an online compiler in the browser.


Mon, 18 Jul 2011 23:20:05 +0000

"If the purpose of those long casts is to truncate the fractional part, may I recommend using floor() function? The way it is now, it looks really hackish."

Thanks I had actually been looking for a better way of doing that.

I have noticed that even if I do something simple like this,

jd = 2455195.7506944;

and then print it, the output is still 2455195.7500000.

I also tried this, Serial.println(sizeof(double)); after reading svofski's earlier post and it returns 4 which to me implies that double is really just a float.

It looks like even though the chipKIT Uno32 is 32-bit and is presumably capable of double precision, the IDE isn't yet set up for it.

Unless I'm missing something, it looks like I'm going to be stuck with this fact. Maybe it is time to look into yet another microcontroller. :)

Thanks again for everyone's input! Gabriel
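The printed 2455195.7500000 is what a 4-byte type would be expected to give. A 32-bit float has a 24-bit significand, and for values between 2^21 and 2^22 the spacing between neighbouring floats is 0.25, so .7506944 cannot be stored. The sketch below is meant for a desktop C++11 compiler rather than the chipKIT, and its helper names are illustrative.

```cpp
#include <cmath>

// Gap between x and the next larger 32-bit float.
float floatSpacing(float x) {
    return std::nextafter(x, INFINITY) - x;
}

// The value that is actually stored when x is squeezed into a 32-bit float.
float asFloat(double x) {
    return static_cast<float>(x);
}
```

Here floatSpacing(2455195.75f) is 0.25, and asFloat(2455195.7506944) comes back as exactly 2455195.75, matching the output seen on the board.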


Mon, 18 Jul 2011 23:31:46 +0000

There is something wrong, somewhere.

void setup () {
  Serial.begin (9600);
  double d = 0.01234567;
  Serial.print (d,7);
}

void loop () {
}

The output was 0.0123457, which is correctly rounded at the 7th digit.

I then ran this code with your number (2455195.7506944) and got the output value 2455195.7500000, exactly as you are seeing.

Time for some deeper digging.


Mon, 18 Jul 2011 23:48:53 +0000

I pasted the following code into the relevant Print.cpp function:

char buf[128];
sprintf (buf,"%f",number);
println (buf);

Got the same problem.

Then I had a flash of inspiration:

#include <math.h>

void setup () {
  Serial.begin (9600);
  double d = 2455195.7506944;
  Serial.println (sizeof(d));
}

void loop () {
}

Guess my answer? Sadly, sizeof(d)=4.

OK, where is my double??


Tue, 19 Jul 2011 00:06:51 +0000

OK, I give up.

I found a mention that a DOUBLE on Pic32 is 4 bytes, and that a LONG DOUBLE is 8 bytes.

I therefore modified the example to use a LONG DOUBLE. It failed to compile because the library functions didn't support this data type.

I modified all of the relevant library functions to use LONG DOUBLE instead of DOUBLE. Now it compiles, and sizeof(d) = 8.

#include <math.h>

void setup () {
  Serial.begin (9600);
  long double d = 2455195.7506944;
  Serial.println (sizeof(d));
  Serial.println (d,7);
}

void loop () {
}

BUT ... but ... but ... Serial.println (d,7) = "2455195.7500000" !!

Here's the test:

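void Print::printFloat(long double number, uint8_t digits)
{
	// Handle negative numbers
	if (number < 0.0)
		number = -number;

	// Test code added for an independent check of the value passed in.
	// Note: %Lf is the standard conversion for long double; %lf is what this test used.
	char buf[128];
	sprintf (buf,"%lf",number);
	println (buf);

	// ... snip ...
}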


What might we have missed in this thread? Based on this behaviour, there do appear to be problems with double-precision floating point numbers.


Tue, 19 Jul 2011 07:50:43 +0000

You should go deeper into the implementation of sprintf(), that's the function that does the real job.

So double in this compiler is fake.. Why indeed, I wonder. Anyway, forewarned is forearmed. Good that now we know.


Tue, 19 Jul 2011 08:41:23 +0000

For clarity, the use of sprintf() in this function is mine. The standard Arduino Print.cpp library doesn't use this code, I added those three lines for an independent check on what was being passed to the routine.

Print.cpp has its own digit extraction and printing logic that follows at the point where I wrote snip in the code sample.
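To separate the printing logic from the storage type, here is a desktop sketch modelled on that kind of digit-by-digit extraction. It is not a copy of Print.cpp. The name formatFixed and the template parameter are illustrative, and it handles only non-negative values.

```cpp
#include <string>

// Formats a non-negative value with a fixed number of decimals:
// round, take the integer part, then peel off one decimal digit at a time.
template <typename T>
std::string formatFixed(T number, int digits) {
    // Add half a unit in the last printed place so the output is rounded.
    T rounding = T(0.5);
    for (int i = 0; i < digits; ++i) rounding /= T(10);
    number += rounding;

    unsigned long intPart = static_cast<unsigned long>(number);
    T remainder = number - static_cast<T>(intPart);

    std::string out = std::to_string(intPart);
    if (digits > 0) out += '.';
    while (digits-- > 0) {
        remainder *= T(10);
        int digit = static_cast<int>(remainder);
        out += static_cast<char>('0' + digit);
        remainder -= static_cast<T>(digit);
    }
    return out;
}
```

Run with T = float, the value 2455195.7506944 formats as 2455195.7500000, while T = double gives 2455195.7506944. This suggests the extraction loop itself is fine and the fraction is lost when the value is stored in 4 bytes.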


Tue, 19 Jul 2011 09:47:29 +0000

To be absolutely sure that it isn't the formatting that's causing the problem you could dump the hex values of those 8 bytes and re-construct the value by hand to check. I've found some float limitations when using GCC on the ARM processor, I don't know what options this MIPS version of GCC was built with, anyone know how to find out?

Does the MIPS 4K processor support float in hardware at all, or is this all soft-float?
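One way to do the byte dump suggested above is sketched below. The helper name hexBytes is illustrative, and std::string is used for brevity on a desktop compiler; on the board the bytes could be printed one at a time instead. Bytes appear in memory order, so on a little-endian target such as PIC32 or x86 the least significant byte comes first.

```cpp
#include <cstdio>
#include <cstring>
#include <string>

// Returns the bytes of v, in memory order, as a lowercase hex string.
template <typename T>
std::string hexBytes(const T& v) {
    unsigned char raw[sizeof(T)];
    std::memcpy(raw, &v, sizeof(T));   // copy out the object representation

    std::string out;
    char pair[3];
    for (std::size_t i = 0; i < sizeof(T); ++i) {
        std::snprintf(pair, sizeof(pair), "%02x", static_cast<unsigned>(raw[i]));
        out += pair;
    }
    return out;
}
```

On a little-endian machine, hexBytes(2455195.75f) yields 6fda154a, i.e. the IEEE 754 single-precision pattern 0x4A15DA6F. The length of the string also shows at a glance how many bytes the type really occupies.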


Tue, 19 Jul 2011 10:17:47 +0000

All soft..


Tue, 19 Jul 2011 16:12:34 +0000

You might try adding the -fno-short-double option to the compiler and linker command lines in platforms.txt. Then, you can continue to use doubles rather than long doubles.
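Whether the option took effect can be checked from inside a program. The sketch below uses the standard <cfloat> macros and assumes the toolchain's float.h reflects the chosen size of double; the function names are illustrative.

```cpp
#include <cfloat>

// True when double carries at least the 53 significant bits of an
// IEEE 754 binary64 value; false when it is really a 32-bit float.
bool doubleIsFullPrecision() {
    return sizeof(double) == 8 && DBL_MANT_DIG >= 53;
}

// Significant decimal digits guaranteed for double
// (15 for binary64, 6 for binary32).
int doubleDecimalDigits() {
    return DBL_DIG;
}
```

Printing both results from setup() before and after changing the compiler options would show whether double has really become an 8-byte type.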


Tue, 02 Aug 2011 23:49:09 +0000

We are considering changing the default size of double for the chipKIT compiler to 8 bytes. In other words, we'd make -fno-short-double the default. Opinions?


Wed, 03 Aug 2011 01:23:16 +0000

"We are considering changing the default size of double for the chipKIT compiler to 8 bytes. In other words, we'd make -fno-short-double the default. Opinions?"

That's what I'd expect.

Alan KM6VV


Wed, 03 Aug 2011 01:37:19 +0000

"We are considering changing the default size of double for the chipKIT compiler to 8 bytes. In other words, we'd make -fno-short-double the default. Opinions?"

Sounds like a good idea to me. Having true double precision would be big help to someone like me who enjoys doing a lot of heavy math "stuff" in their programs.

As mentioned above, it is what I would expect double to be too.


Wed, 03 Aug 2011 02:58:47 +0000

"We are considering changing the default size of double for the chipKIT compiler to 8 bytes. In other words, we'd make -fno-short-double the default. Opinions?"

That makes perfect sense to me as well. That way, one can use 'float' or 'double' depending on what they need, and is what just about everyone would assume is the case.


Wed, 03 Aug 2011 06:45:35 +0000


As with the integer types, shouldn't we be using more explicit names?

int8_t, uint8_t (unsigned char, byte), int16_t (int), uint16_t (unsigned int, word), int32_t (long), uint32_t (unsigned long), ...

float32 (float), float64 (double), float80 (long double) (*), ...

Reference: — () not included in*


Wed, 03 Aug 2011 09:21:52 +0000

Would that mean that you'd have to remember additional function names to do single precision math (atan2f()), and that intermediate floating point results would all be done in double precision even with single precision operands? I'm not opposed to double being 8bytes, but I'm not sure I want to give up the ability to easily restrict my computations to 4bytes (which I think is C's fault by default?)

How does 8-byte float performance compare to 4-byte on PIC32 using the PIC libraries?


Wed, 03 Aug 2011 10:06:50 +0000


You're right about the type names complexity.

Here's what the same code produces on Arduino and chipKIT (attachment: About Types - Arduino - chipKIT.png).

For that very reason I raised the question about the Arduino / chipKIT Compatibility Strategy.

#include <stdint.h>

short s;
byte b;
int i;
long l;
word w;
unsigned long ul;

uint8_t ui8;
int8_t i8;
uint16_t ui16;
int16_t i16;
int32_t i32;
uint32_t ui32;

double d;
float f;
long double ld;

void setup() {
  Serial.begin(9600);
  Serial.print("\n\n\n*** About Types\n");

  // Report which platform the sketch was built for.
#if defined(__AVR_ATmega168__) || defined(__AVR_ATmega328P__)
  Serial.print("\t Arduino \n");
#endif

#if defined(__PIC32MX__)
  Serial.print("\t chipKIT\n");
#endif

  // Print the size in bytes of each type.
  Serial.print("byte \t"); Serial.print(sizeof(b)); Serial.print("\n");
  Serial.print("word \t"); Serial.print(sizeof(w)); Serial.print("\n");

  Serial.print("short \t"); Serial.print(sizeof(s)); Serial.print("\n");
  Serial.print("int \t"); Serial.print(sizeof(i)); Serial.print("\n");
  Serial.print("long \t"); Serial.print(sizeof(l)); Serial.print("\n");
  Serial.print("unsigned long \t"); Serial.print(sizeof(ul)); Serial.print("\n");

  Serial.print("uint8_t \t"); Serial.print(sizeof(ui8)); Serial.print("\n");
  Serial.print("int8_t \t"); Serial.print(sizeof(i8)); Serial.print("\n");
  Serial.print("uint16_t \t"); Serial.print(sizeof(ui16)); Serial.print("\n");
  Serial.print("int16_t \t"); Serial.print(sizeof(i16)); Serial.print("\n");
  Serial.print("uint32_t \t"); Serial.print(sizeof(ui32)); Serial.print("\n");
  Serial.print("int32_t \t"); Serial.print(sizeof(i32)); Serial.print("\n");

  Serial.print("float \t"); Serial.print(sizeof(f)); Serial.print("\n");
  Serial.print("double \t"); Serial.print(sizeof(d)); Serial.print("\n");
  Serial.print("long double \t"); Serial.print(sizeof(ld)); Serial.print("\n");
}

void loop() {
}


Wed, 03 Aug 2011 10:40:21 +0000

I'm for 8-byte doubles, because it's more sane that way. And Arduino legacy crutches don't have to be suffered forever.

Doubles are pretty rare and usually the code that uses doubles in micros belongs to people who never cared about the binary representation of doubles anyway. For them some extra precision may be just a pleasant surprise.


Wed, 03 Aug 2011 12:21:27 +0000

And when talking about compatibility, let's also not forget that compatibility with C/C++ is equally important as Arduino. When dealing with hardware I can accept having to deal with register names and tricks that are non-standard, but when it comes to "normal" code, especially floating point, compatibility with C/C++ is as important to me as anything else.

These boards are very powerful, especially the MAX, and sooner or later people will get to the point of implementing (or porting existing) math algorithms (filters, FFTs, etc.), and having full compatibility with IEEE-style math will be priceless.

So having float = 4, double =8, and letting gcc handle them according to the C/C++ standard for promotion/conversion/type casting is almost mandatory.


Wed, 03 Aug 2011 17:46:46 +0000

"How does 8-byte float performance compare to 4-byte on PIC32 using the PIC libraries?"

There would definitely be both a performance and a code-size hit, but I don't have specific number. The performance and code size would also change when we move to a different standard C library.

Yes, you'd also have to use the single-precision function names when calling things like atan2f(). You would not see a functional problem when calling the double-precision version of the function, probably just a code size and performance cost.
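A short sketch of what that looks like in code once double is 8 bytes. The function names are illustrative. The point is that single precision is kept only when both the operands and the library call are the float variants, and that an unsuffixed literal such as 2.0 is a double.

```cpp
#include <math.h>

// Single precision throughout: float arguments, atan2f, float result.
float bearingSingle(float y, float x) {
    return atan2f(y, x);
}

// Double precision: the float arguments are converted to double first.
double bearingDouble(float y, float x) {
    return atan2(static_cast<double>(y), static_cast<double>(x));
}

// A floating literal without the f suffix is a double, so this product is
// computed in double; with 2.0f it would stay in float.
double scaledInDouble(float v) {
    return v * 2.0;
}
```

Both bearing functions return the same angle to within float precision. The cost of the double version would show up in code size and run time on a soft-float part rather than in the result.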

I think the community has been generally in favor of the change.