Let's Make Robots!

Arduino memory usage

All,

I am doing some tests to try to understand how C++ impacts available memory on the Arduino. I am profiling some code I have written and trying to understand the impact of allocating objects on the stack with the new command (which traditionally uses malloc or one of its derivatives under the hood - is that true with the gcc compiler?). As I understand it, the Arduino has:

32k program memory - the program size shown in the Arduino UI, e.g. "Binary sketch size: 14,218 bytes (of a 32,256 byte maximum)".
2k SRAM - which is shared by the stack and any memory the program allocates (e.g. int t = 20; t is allocated here).
1k EEPROM - can put storage here; reading and writing this memory will be slow, which is not important to this testing since I want fast access.


I found the memoryTest() code somewhere online. Offhand, it seems it should work and should be accurate, but I am not sure. I know this would work with a PC, but I am not sure about an Arduino.

// this function will return the number of bytes currently free in RAM
int memoryTest() {
  int byteCounter = 0; // initialize a counter
  byte *byteArray; // create a pointer to a byte array
  // More on pointers here: http://en.wikipedia.org/wiki/Pointer#C_pointers

  // use the malloc function to repeatedly attempt allocating a certain number of bytes to memory
  // More on malloc here: http://en.wikipedia.org/wiki/Malloc
  while ( (byteArray = (byte*) malloc (byteCounter * sizeof(byte))) != NULL ) {
    byteCounter++; // if allocation was successful, then up the count for the next try
    free(byteArray); // free memory after allocating it
  }
 
  free(byteArray); // also free memory after the function finishes
  return byteCounter; // send back the highest number of bytes successfully allocated
}


char * array = char(1024);


void setup()
{
  Serial.begin(9600) ;
 
  Serial.println(memoryTest());
}
 
void loop()
{

 
 
}

If I run this, I get 638 back from the memoryTest() call. If I remove the array allocation, I get 1278 as available SRAM. I have tried a number of different allocations using the C++ new command, and no matter what allocations I use, I get back either 638 or 1278 for available SRAM. I am sure it allocates memory in large chunks, but I was hoping there was an expert on how the gcc compiler manages memory.

The allocation

char * array = char(2048);

definitely fails and returns null, as you would expect, but

char * array = char(1024);

will succeed, and then memoryTest() returns 1278, which seems inaccurate. Something weird is going on here. Any ideas?

Regards,

Bill

I found this "freeRAM()" function a while ago; maybe it can help?

http://jeelabs.org/2011/05/22/atmega-memory-use/

Will have to compare this versus what I am using and see if they agree.  Great to have in the developer's quiver as it were.  Thanks!

Thanks JerZ.  That's a neat little function, well thought out.  The only problem with it is it only tells you how much has NEVER been allocated.  If you allocate some and then free some, it doesn't take into account the freed memory, which can be used.  Still useful, and may be all that's needed.

BTW, finding ALL free memory is a near impossible task in C/C++ unless your program keeps track of it for you.
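One way a program can "keep track of it" is to wrap malloc and free and count the bytes itself. The following is only a sketch under stated assumptions: trackedMalloc, trackedFree and trackedBytesInUse are hypothetical names, the size is stored in a small header in front of each block, and std::max_align_t requires a C++11 compiler. It counts requested bytes only and ignores the allocator's own overhead.

```cpp
#include <cstddef>
#include <cstdlib>
#include <cstring>

// Hypothetical wrapper: remembers each block's requested size in a header
// placed in front of the pointer handed back to the caller.
static const std::size_t kHeaderBytes = sizeof(std::max_align_t); // keeps the payload aligned
static std::size_t bytesInUse = 0;                                // requested bytes currently live

void *trackedMalloc(std::size_t size) {
  unsigned char *raw = static_cast<unsigned char *>(std::malloc(kHeaderBytes + size));
  if (raw == NULL) {
    return NULL;                          // out of memory: nothing to track
  }
  std::memcpy(raw, &size, sizeof(size));  // record the size in the header
  bytesInUse += size;
  return raw + kHeaderBytes;              // caller sees only the payload
}

void trackedFree(void *ptr) {
  if (ptr == NULL) {
    return;
  }
  unsigned char *raw = static_cast<unsigned char *>(ptr) - kHeaderBytes;
  std::size_t size;
  std::memcpy(&size, raw, sizeof(size));  // read the size back from the header
  bytesInUse -= size;
  std::free(raw);
}

std::size_t trackedBytesInUse() {
  return bytesInUse;
}
```

Every allocation in the program would have to go through the wrapper for the count to mean anything, and the header costs extra RAM per block, which matters on a 2k part.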

Another thought.  If you just want to know the smallest block size allocated, you can try allocating two bytes, one at a time, back to back, converting the pointers to ints and subtracting.  This is NOT guaranteed to work, but may be worth a shot.  Like this:

int a = (int) new byte;

int b = (int) new byte;

int size = b-a;
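The same experiment can be written so that it also compiles where pointers are wider than int, and so that it frees what it allocates. This is a sketch, not a guaranteed measurement: allocationStride is a hypothetical name, intptr_t assumes a C++11 compiler, and the result depends entirely on how the allocator happens to place the two blocks.

```cpp
#include <cstdint>
#include <cstdlib>

// Distance in bytes between two back-to-back 1-byte allocations.
// Returns 0 if either allocation fails. The value may be negative if the
// allocator places the second block below the first.
long allocationStride() {
  unsigned char *a = static_cast<unsigned char *>(std::malloc(1));
  unsigned char *b = static_cast<unsigned char *>(std::malloc(1));
  if (a == NULL || b == NULL) {
    std::free(a);  // free(NULL) is a harmless no-op
    std::free(b);
    return 0;
  }
  // Convert through an integer type wide enough to hold a pointer.
  std::intptr_t addrA = reinterpret_cast<std::intptr_t>(a);
  std::intptr_t addrB = reinterpret_cast<std::intptr_t>(b);
  long stride = static_cast<long>(addrB - addrA);
  std::free(b);
  std::free(a);
  return stride;
}
```

On the Arduino the result could be sent out with Serial.println(allocationStride()) from setup().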

Also, a couple of notes on GCC and memory.  GCC does a LOT of optimizing, depending on version and command line flags.  It often won't even put a local variable on the stack.  The AVR has a lot of registers, and GCC often keeps local variables in registers and never puts them on the stack.  If you want to play with standard GCC (not Arduino), you can have it output the assembly with the -S flag and look at it.

When you use malloc or new, the memory is allocated on the heap, not the stack.  The stack is strictly last in, first out.  The heap allows ANY order of allocations and deallocations DURING RUN TIME.  The compiler has no idea how that memory will be allocated / deallocated.  Using any type of static analysis will give at best a very crude idea of memory usage.  In general, the compiler allocates global variables (and statics), including objects, at one end of memory and automatic (local) variables on the stack at the other end; what's left in between is handled during run time by the memory manager for new/malloc.  The AVR LIBC manual (Arduino uses AVR LIBC) has some discussion.  Here is a link to the modules list, which includes the stdlib (and malloc) portion.  The manual also has a section discussing the hazards of using C++ on AVRs.

http://www.nongnu.org/avr-libc/user-manual/modules.html

In short, the compiler has no idea and no control over what your program does with malloc/new.  It is up to you (with a bit of help from the memory manager.)    This is always true in C/C++, but especially hazardous on small processors like AVR with little ram.

Just to clarify the above, the compiler doesn't control malloc/new memory allocation.  It is handled entirely by the library, in this case AVR LIBC.  I am not sure what algorithm they use, but as I mentioned there is some discussion in the manual and of course you could always look at the source code.

The numbers you are getting seem reasonable to me.  The C runtime needs some space, your variables need some, the arduino code adds quite a bit of unseen overhead, and what's left should be approximately the value you get back.  Remember also the allocator adds a few bytes to "mark" the block it is allocating so it knows what it is (size, etc) later when you free it.  Those bytes are typically "before" the pointer you get back.
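To make the "few bytes before the pointer" idea concrete, here is a toy allocator over a fixed arena. It is a sketch for illustration only and is not the AVR LIBC implementation: toyAlloc, toyBlockSize and arenaBytesUsed are hypothetical names, the 2-byte header is an assumption, and there is no free().

```cpp
#include <cstddef>
#include <cstdint>
#include <cstring>

// Toy arena: each block is [2-byte size header][payload].
static unsigned char arena[1024];
static std::size_t arenaUsed = 0;

void *toyAlloc(std::uint16_t size) {
  const std::size_t need = sizeof(std::uint16_t) + size;
  if (need > sizeof(arena) - arenaUsed) {
    return NULL;                               // arena exhausted
  }
  unsigned char *block = arena + arenaUsed;
  std::memcpy(block, &size, sizeof(size));     // the "mark" in front of the payload
  arenaUsed += need;
  return block + sizeof(std::uint16_t);        // pointer the caller gets back
}

// Reads the size back from the header that sits just before the pointer.
std::uint16_t toyBlockSize(const void *ptr) {
  std::uint16_t size;
  std::memcpy(&size,
              static_cast<const unsigned char *>(ptr) - sizeof(std::uint16_t),
              sizeof(size));
  return size;
}

std::size_t arenaBytesUsed() {
  return arenaUsed;
}
```

In this model five 100-byte requests consume 510 bytes rather than 500, which is the kind of unseen overhead described above.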

Thanks for all the help you have given me.  You got me pointed in the right direction, and I think I have an answer as to how memory allocation works on the Arduino.

First off, I realized that syntax was wrong for allocating an array. It should be

char * array = new char [1024] ;

instead of

char * array = new char (1024) ;

which allocates a single char initialized with the value 1024 rather than an array.  I am a bit rusty with C++, so my apologies.  However, the correct syntax

char * array = new char [1024] ;

doesn't compile:

SRLTest.cpp.o: In function `__static_initialization_and_destruction_0':
C:\Users\ADMINI~1.GIL\AppData\Local\Temp\build2936879128026788961.tmp/SRLTest.cpp:30: undefined reference to `operator new[](unsigned int)'

because it seems the Arduino gcc toolchain supports only a subset of C++ functionality; here the linker cannot find operator new[].
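A workaround for that linker error is to supply the missing operators yourself on top of malloc and free. This is a minimal sketch, assuming the toolchain links malloc/free but provides no operator new[] / delete[]; it returns null on failure instead of throwing, which is not standard-conforming behavior on a desktop compiler.

```cpp
#include <stddef.h>
#include <stdlib.h>

// Array form of new, forwarded to malloc.
// Assumption: exceptions are unavailable, so failure is reported as NULL.
void *operator new[](size_t size) {
  return malloc(size);
}

// Array form of delete, forwarded to free.
void operator delete[](void *ptr) {
  free(ptr);
}
```

With these two definitions in the sketch, char * array = new char [1024]; should link, and the result still needs a null check before use.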


So I have my class which probably allocates around 50 bytes of memory, but it has base classes and child objects with base classes so there are v-tables etc. 


With an empty Arduino application it shows 1278.


With this call:

MyAutonomousRobot * robot = new MyAutonomousRobot() ;

it shows 638.

with

MyAutonomousRobot * robot = new MyAutonomousRobot() ;
MyAutonomousRobot * robot2 = new MyAutonomousRobot() ;

it shows 638.

with

MyAutonomousRobot * robot = new MyAutonomousRobot() ;
MyAutonomousRobot * robot2 = new MyAutonomousRobot() ;
MyAutonomousRobot * robot3 = new MyAutonomousRobot() ;

it shows 318.

with

MyAutonomousRobot * robot = new MyAutonomousRobot() ;
MyAutonomousRobot * robot2 = new MyAutonomousRobot() ;
MyAutonomousRobot * robot3 = new MyAutonomousRobot() ;
MyAutonomousRobot * robot4 = new MyAutonomousRobot() ;

it shows 318.

with

MyAutonomousRobot * robot = new MyAutonomousRobot() ;
MyAutonomousRobot * robot2 = new MyAutonomousRobot() ;
MyAutonomousRobot * robot3 = new MyAutonomousRobot() ;
MyAutonomousRobot * robot4 = new MyAutonomousRobot() ;
MyAutonomousRobot * robot5 = new MyAutonomousRobot() ;

it shows 38.

MyAutonomousRobot * robot = new MyAutonomousRobot() ;
MyAutonomousRobot * robot2 = new MyAutonomousRobot() ;
MyAutonomousRobot * robot3 = new MyAutonomousRobot() ;
MyAutonomousRobot * robot4 = new MyAutonomousRobot() ;
MyAutonomousRobot * robot5 = new MyAutonomousRobot() ;
MyAutonomousRobot * robot6 = new MyAutonomousRobot() ;

It never returns.  I assume it is a stack overflow: crash and burn.


char * test = (char*) malloc(100);
char * test1 = (char*) malloc(100);
char * test2 = (char*) malloc(100);
char * test3= (char*) malloc(100);
char * test4 = (char*) malloc(100);

MyAutonomousRobot * robot = new MyAutonomousRobot() ;

it shows 638. 

char * test = (char*) malloc(100);
char * test1 = (char*) malloc(100);
char * test2 = (char*) malloc(100);
char * test3= (char*) malloc(100);
char * test4 = (char*) malloc(100);
char * test5 = (char*) malloc(100);

MyAutonomousRobot * robot = new MyAutonomousRobot() ;

it shows 318

The first allocation takes me from 1278 to 638.  I think it allocates a large chunk and then parcels it out to the next new calls.  It looks like each of my robot allocations uses a little over 100 bytes, which seems right.

This is some of the testing on the Standard Robotics Library proof of concept that I posted about, and that you were kind enough to comment on a while back.  I will be very interested in what you think of the overall design when I am ready.  Speed-wise, my tests have been very good; my design added 2 microseconds to the scan time on a simple robot, but as you can tell, the memory footprint might be too large for an Arduino with a complex problem.  More testing is required.

My next step is to wire this up to a simple robot.  I am not sure how much stack space I will need so will keep allocating memory and running tests until the program starts to run incorrectly. This embedded stuff sure is fun...

As always, thanks for your help.

Regards,

Bill

 

 

Mostly, anyway.  I think the allocator hands out large blocks then breaks them up.  Now that you mention it I seem to recall reading something similar to that in AVR LIBC.  I'm going off a bad memory that may have different things jumbled together and just plain wrong, but I seem to recall that it uses the allocator (name?) that hands out powers of two (1024, 512,..., 32) and breaks them up when needed.

A couple of things you can do to save space:  use local variables when you can instead of new/malloc, or static/global variables if they are not going to go away.  That will save the overhead of new/malloc and keep memory from fragmenting.  It also saves you from having to keep track of what's in the heap.  You can also allocate arrays of objects when you need more than one.  Again, that saves the overhead, which I believe is 2 bytes per allocation (may be 4?).  You might also do your own memory management.  A custom manager tuned just to your needs "might" be worthwhile in some cases.  If you know how the app will use the memory, you can make it specific.  You can do this for just some types of objects, too.  But unless you get desperate, it probably isn't worth the trouble.
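As an illustration of a manager "tuned just to your needs", here is a sketch of a fixed pool of objects held in one static array, so there is no per-allocation heap overhead and nothing to fragment. PooledRobot, acquireRobot and releaseRobot are hypothetical names, and the pool size of 4 is an arbitrary assumption.

```cpp
#include <stddef.h>

// Hypothetical stand-in for a robot object; real fields would go here.
struct PooledRobot {
  int id;
  bool inUse;
};

static const int kPoolSize = 4;
static PooledRobot pool[kPoolSize];  // static storage: zero-initialized, so inUse starts false

// Hands out the first free slot, or NULL when the pool is exhausted.
PooledRobot *acquireRobot() {
  for (int i = 0; i < kPoolSize; ++i) {
    if (!pool[i].inUse) {
      pool[i].inUse = true;
      pool[i].id = i;
      return &pool[i];
    }
  }
  return NULL;
}

// Returns a slot to the pool; the memory itself never moves.
void releaseRobot(PooledRobot *robot) {
  if (robot != NULL) {
    robot->inUse = false;
  }
}
```

The trade-off is that the pool's RAM is reserved for the whole run, whether or not all slots are in use.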

I'll take a Michelob, ice cold thank you :-)  Actually, I should buy you one for exercising some long forgotten neurons.  But, then again, helping each other is why we are here, isn't it?

Good luck.  Can't wait to see what you come up with.  I think it will be quite interesting.

The beer is in the fridge.  If you are ever out toward NH or MA, let me know.  We are here to help each other but it certainly has been a one way street so far.  The learning curve is pretty steep on robotics, no matter what anyone says.

The one thing I remember about embedded development from college is that you are supposed to allocate everything you will ever need up front, as part of your setup, whenever possible.  That way, if there is a problem with memory, you will know right away rather than having mysterious failures later on, and you avoid those pesky memory fragmentation issues.  With that one statement, I have exhausted all of my knowledge of embedded dev!
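One way to "know right away" is to declare the long-lived data statically and let the compiler check it against a RAM budget. The sketch below makes several assumptions: MotorState, motors, logBuffer and staticFootprint are hypothetical, the 2048-byte SRAM figure is the one quoted earlier in this thread, the 512 bytes reserved for the stack is a guess, and static_assert needs a C++11 compiler.

```cpp
#include <stddef.h>

// Hypothetical long-lived data, all declared up front.
struct MotorState {
  int speed;
  int direction;
};

MotorState motors[4];
char logBuffer[256];

static const size_t kSramBytes = 2048;        // SRAM size quoted in the thread
static const size_t kReservedForStack = 512;  // assumption: headroom left for the stack

// Fails at compile time if the static data no longer fits the budget.
static_assert(sizeof(motors) + sizeof(logBuffer) <= kSramBytes - kReservedForStack,
              "static data exceeds the SRAM budget");

// Total bytes taken by the up-front data, for reporting at startup.
size_t staticFootprint() {
  return sizeof(motors) + sizeof(logBuffer);
}
```

This only covers data declared in the sketch itself; library globals and the stack still need to be measured at run time.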

In my design, everything is created up front.  There are a few variables created on the stack of course as part of processing, but minimized where possible.  So far of course, I am doing simple things so it is easy to not allocate anything except at the beginning.  Will need to scale it out from proof of concept to more complex algorithms and see how things work but with my design, I think it will hold up pretty much that way.

I had forgotten about the overhead of using new or malloc.  I will have to go back and make a few changes to see if I can minimize my use of them.  Most of my classes have overloaded constructors to ensure they are set up properly, although I could move to an Initialize (...) method with all of that passed in via parameters, but that is probably more expensive than using new to begin with.  Things to keep in mind moving forward, though.

Memory manager - I hope I don't have to move to that.  That sounds like tedious work fraught with error.

Someone posted code using timer interrupts this week, which got me thinking about using them as a scheduler.  If that happens, it would be version 2 or 3.  It would be a poor man's real-time OS, but of course not portable.

Thanks again.  I will post my library code in the next few weeks.

Regards,

Bill