Memory Allocators 101 - Write a simple memory allocator

Memory Allocators 101 - Write a simple memory allocator - Arjun Sreedharan

  • Blog
  • About
  • Contact
  • Posts

Google+LinkedInGithubFacebookTwitterUMass Amherst

1:11 AM 9th 八月 20160 notes

  1. Memory Allocators 101 - Write a simple memory allocator

     

    Code related to this article: github.com/arjun024/memalloc

    This article is about writing a simple memory allocator in C.
    We will implement malloc(), calloc(), realloc() and free().

    This is a beginner level article, so I will not spell out every detail.
    This memory allocator will not be fast and efficient, we will not adjust allocated memory to align to a page boundary, but we will build a memory allocator that works. That’s it.

    If you want to take a look at the code in full, take a look at my github repo memalloc.

    Before we get into building the memory allocator, you need to be familiar with the memory layout of a program. A process runs within its own virtual address space that’s distinct from the virtual address spaces of other processes. This virtual address space typically comprises of 5 sections:

    Text section: The part that contains the binary instructions to be executed by the processor.
    Data section: Contains non-zero initialized static data.
    BSS (Block Started by Symbol) : Contains zero-initialized static data. Static data uninitialized in program is initialized 0 and goes here.
    Heap: Contains the dynamically allocated data.
    Stack: Contains your automatic variables, function arguments, copy of base pointer etc. Memory layout

    As you can see in the image, the stack and the heap grow in the opposite directions.
    Sometimes the data, bss and heap sections are collectively referred to as the “data segment”,
    the end of which is demarcated by a pointer named program break or brk.
    That is, brk points to the end of the heap.

    Now if we want to allocate more memory in the heap, we need to request the system to increment brk. Similarly, to release memory we need to request the system to decrement brk.

    Assuming we run Linux (or a Unix-like system), we can make use of sbrk() system call that lets us manipulate the program break.

    Calling sbrk(0) gives the current address of program break.
    Calling sbrk(x) with a positive value increments brk by x bytes, as a result allocating memory.
    Calling sbrk(-x) with a negative value decrements brk by x bytes, as a result releasing memory.
    On failure, sbrk() returns (void*) -1.

    To be honest, sbrk() is not our best buddy in 2015. There are better alternatives like mmap() available today. sbrk() is not really thread safe. It can can only grow or shrink in LIFO order.
    If you do a man 2 sbrk on your macbook, it will tell you:

    The brk and sbrk functions are historical curiosities left over from earlier days before the advent of virtual memory management.

    However, the glibc implementation of malloc still uses sbrk() for allocating memory that’s not too big in size.[1]
    So, we will go ahead with sbrk() for our simple memory allocator.

    malloc()

    The malloc(size) function allocates size bytes of memory and returns a pointer to the allocated memory.
    Our simple malloc will look like:

    void *malloc(size_t size)
    {
    	void *block;
    	block = sbrk(size);
    	if (block == (void*) -1)
    		return NULL;
    	return block;
    }
    

    In the above code, we call sbrk() with the given size.
    On success, size bytes are allocated on the heap.
    That was easy. Wasn’t it?

    The tricky part is freeing this memory.
    The free(ptr) function frees the memory block pointed to by ptr, which must have been returned by a previous call to malloc(), calloc() or realloc().
    But to free a block of memory, the first order of business is to know the size of the memory block to be freed. In the current scheme of things, this is not possible as the size information is not stored anywhere. So, we will have to find a way to store the size of an allocated block somewhere.

    Moreover, we need to understand that the heap memory the operating system has provided is contiguous. So we can only release memory which is at the end of the heap. We can’t release a block of memory in the middle to the OS. Imagine your heap to be something like a long loaf of bread that you can stretch and shrink at one end, but you have to keep it in one piece.
    To address this issue of not being able to release memory that’s not at the end of the heap, we will make a distinction between freeing memory and releasing memory.
    From now on, freeing a block of memory does not necessarily mean we release memory back to OS. It just means that we keep the block marked as free. This block marked as free may be reused on a later malloc() call. Since memory not at the end of the heap can’t be released, this is the only way ahead for us.

    So now, we have two things to store for every block of allocated memory:
        1. size
        2. Whether a block is free or not-free?

    To store this information, we will add a header to every newly allocated memory block.
    The header will look something like this:

    struct header_t {
    	size_t size;
    	unsigned is_free;
    };
    

    The idea is simple. When a program requests for size bytes of memory, we calculate total_size = header_size + size, and call sbrk(total_size). We use this memory space returned by sbrk() to fit in both the header and the actual memory block. The header is internally managed, and is kept completely hidden from the calling program.

    Now, each one of our memory blocks will look like:
    memory block with header

    We can’t be completely sure the blocks of memory allocated by our malloc is contiguous. Imagine the calling program has a foreign sbrk(), or there’s a section of memory mmap()ed in between our memory blocks. We also need a way to traverse through our blocks for memory (why traverse? we will get to know when we look at the implementation of free()). So to keep track of the memory allocated by our malloc, we will put them in a linked list. Our header will now look like:

    struct header_t {
    	size_t size;
    	unsigned is_free;
    	struct header_t *next;
    };
    

    and the linked list of memory blocks like this:

    linked list of memory blocks

    Now, let’s wrap the entire header struct in a union along with a stub variable of size 16 bytes. This makes the header end up on a memory address aligned to 16 bytes. Recall that the size of a union is the larger size of its members. So the union guarantees that the end of the header is memory aligned. The end of the header is where the actual memory block begins and therefore the memory provided to the caller by the allocator will be aligned to 16 bytes.

    typedef char ALIGN[16];
    
    union header {
    	struct {
    		size_t size;
    		unsigned is_free;
    		union header *next;
    	} s;
    	ALIGN stub;
    };
    typedef union header header_t;
    

    We will have a head and tail pointer to keep track of the list.

    header_t *head, *tail;
    

    To prevent two or more threads from concurrently accessing memory, we will put a basic locking mechanism in place.

    We’ll have a global lock, and before every action on memory you have to acquire the lock, and once you are done you have to release the lock.

    pthread_mutex_t global_malloc_lock;
    

    Our malloc is now modified to:

    void *malloc(size_t size)
    {
    	size_t total_size;
    	void *block;
    	header_t *header;
    	if (!size)
    		return NULL;
    	pthread_mutex_lock(&global_malloc_lock);
    	header = get_free_block(size);
    	if (header) {
    		header->s.is_free = 0;
    		pthread_mutex_unlock(&global_malloc_lock);
    		return (void*)(header + 1);
    	}
    	total_size = sizeof(header_t) + size;
    	block = sbrk(total_size);
    	if (block == (void*) -1) {
    		pthread_mutex_unlock(&global_malloc_lock);
    		return NULL;
    	}
    	header = block;
    	header->s.size = size;
    	header->s.is_free = 0;
    	header->s.next = NULL;
    	if (!head)
    		head = header;
    	if (tail)
    		tail->s.next = header;
    	tail = header;
    	pthread_mutex_unlock(&global_malloc_lock);
    	return (void*)(header + 1);
    }
    
    header_t *get_free_block(size_t size)
    {
    	header_t *curr = head;
    	while(curr) {
    		if (curr->s.is_free && curr->s.size >= size)
    			return curr;
    		curr = curr->s.next;
    	}
    	return NULL;
    }
    

    Let me explain the code:

    We check if the requested size is zero. If it is, then we return NULL.
    For a valid size, we first acquire the lock. The we call get_free_block() - it traverses the linked list and see if there already exist a block of memory that is marked as free and can accomodate the given size. Here, we take a first-fit approach in searching the linked list.

    If a sufficiently large free block is found, we will simply mark that block as not-free, release the global lock, and then return a pointer to that block. In such a case, the header pointer will refer to the header part of the block of memory we just found by traversing the list. Remember, we have to hide the very existence of the header to an outside party. When we do (header + 1), it points to the byte right after the end of the header. This is incidentally also the first byte of the actual memory block, the one the caller is interested in. This is cast to (void*) and returned.

    If we have not found a sufficiently large free block, then we have to extend the heap by calling sbrk(). The heap has to be extended by a size that fits the requested size as well a header. For that, we first compute the total size: total_size = sizeof(header_t) + size;. Now, we request the OS to increment the program break: sbrk(total_size).

    In the memory thus obtained from the OS, we first make space for the header. In C, there is no need to cast a void* to any other pointer type, it is always safely promoted. That’s why we don’t explicitly do: header = (header_t *)block;
    We fill this header with the requested size (not the total size) and mark it as not-free. We update the next pointer, head and tail so to reflect the new state of the linked list. As explained earlier, we hide the header from the caller and hence return (void*)(header + 1). We make sure we release the global lock as well.

    free()

    Now, we will look at what free() should do. free() has to first deterimine if the block-to-be-freed is at the end of the heap. If it is, we can release it to the OS. Otherwise, all we do is mark it ‘free’, hoping to reuse it later.

    void free(void *block)
    {
    	header_t *header, *tmp;
    	void *programbreak;
    
    	if (!block)
    		return;
    	pthread_mutex_lock(&global_malloc_lock);
    	header = (header_t*)block - 1;
    
    	programbreak = sbrk(0);
    	if ((char*)block + header->s.size == programbreak) {
    		if (head == tail) {
    			head = tail = NULL;
    		} else {
    			tmp = head;
    			while (tmp) {
    				if(tmp->s.next == tail) {
    					tmp->s.next = NULL;
    					tail = tmp;
    				}
    				tmp = tmp->s.next;
    			}
    		}
    		sbrk(0 - sizeof(header_t) - header->s.size);
    		pthread_mutex_unlock(&global_malloc_lock);
    		return;
    	}
    	header->s.is_free = 1;
    	pthread_mutex_unlock(&global_malloc_lock);
    }
    

    Here, first we get the header of the block we want to free. All we need to do is get a pointer that is behind the block by a distance equalling the size of the header. So, we cast block to a header pointer type and move it behind by 1 unit.
    header = (header_t*)block - 1;

    sbrk(0) gives the current value of program break. To check if the block to be freed is at the end of the heap, we first find the end of the current block. The end can be computed as (char*)block + header->s.size. This is then compared with the program break.

    If it is in fact at the end, then we could shrink the size of the heap and release memory to OS. We first reset our head and tail pointers to reflect the loss of the last block. Then the amount of memory to be released is calculated. This the sum of sizes of the header and the acutal block: sizeof(header_t) + header->s.size. To release this much amount of memory, we call sbrk() with the negative of this value.

    In the case the block is not the last one in the linked list, we simply set the is_free field of its header. This is the field checked by get_free_block() before actually calling sbrk() on a malloc().

    calloc()

    The calloc(num, nsize) function allocates memory for an array of num elements of nsize bytes each and returns a pointer to the allocated memory. Additionally, the memory is all set to zeroes.
    void *calloc(size_t num, size_t nsize)
    {
    	size_t size;
    	void *block;
    	if (!num || !nsize)
    		return NULL;
    	size = num * nsize;
    	/* check mul overflow */
    	if (nsize != size / num)
    		return NULL;
    	block = malloc(size);
    	if (!block)
    		return NULL;
    	memset(block, 0, size);
    	return block;
    }
    

    Here, we do a quick check for multiplicative overflow, then call our malloc(),
    and clears the allocated memory to all zeroes using memset().

    realloc()

    realloc() changes the size of the given memory block to the size given.

    void *realloc(void *block, size_t size)
    {
    	header_t *header;
    	void *ret;
    	if (!block || !size)
    		return malloc(size);
    	header = (header_t*)block - 1;
    	if (header->s.size >= size)
    		return block;
    	ret = malloc(size);
    	if (ret) {
    		
    		memcpy(ret, block, header->s.size);
    		free(block);
    	}
    	return ret;
    }
    

    Here, we first get the block’s header and see if the block already has the size to accomodate the requested size. If it does, there’s nothing to be done.

    If the current block does not have the requested size, then we call malloc() to get a block of the request size, and relocate contents to the new bigger block using memcpy(). The old memory block is then freed.

    Compiling and using our memory allocator.

    You can get the code from my github repository - memalloc.
    We’ll compile our memory allocator and then run a utility like ls using our memory allocator.

    To do that, we will first compile it as a library file.

    $ gcc -o memalloc.so -fPIC -shared memalloc.c
    

    The -fPIC and -shared options makes sure the compiled output has position-independent code and tells the linker to produce a shared object suitable for dynamic linking.

    On Linux, if you set the enivornment variable LD_PRELOAD to the path of a shared object, that file will be loaded before any other library. We could use this trick to load our compiled library file first, so that the later commands run in the shell will use our malloc(), free(), calloc() and realloc().

    $ export LD_PRELOAD=$PWD/memalloc.so
    

    Now,

    $ ls
    memalloc.c		memalloc.so
    

    Voila! That’s our memory allocator serving ls.
    Print some debug message in malloc() and see it for yourself if you don’t believe me.

    Thank you for reading. All comments are welcome. Please report bugs if you find any.

    Footnotes, References

    See a list of memory allocators:
    liballoc
    Doug Lea’s Memory Allocator.
    TCMalloc
    ptmalloc

    [1] The GNU C Library: Malloc Tunable Parameters
    OSDev - Memory allocation
    Memory Allocators 101 - James Golick

  2. cmemoryMemory Allocationoperating system






     

    Disclaimer: The views expressed here are solely those of the author in his private capacity and do not in any way represent the views of the author's employer or any organization associated with the author.

    points






















 



















































 

Check this out:

Wikicoding, the wikipedia of code

Recent Posts:

Simplicity is the ultimate sophistication.

©

Arjun Sreedharan 2013 - 2023

本文来自互联网用户投稿,该文观点仅代表作者本人,不代表本站立场。本站仅提供信息存储空间服务,不拥有所有权,不承担相关法律责任。如若转载,请注明出处:/a/75592.html

如若内容造成侵权/违法违规/事实不符,请联系我们进行投诉反馈qq邮箱809451989@qq.com,一经查实,立即删除!

相关文章

Grafana展示k8s中pod的jvm监控面板/actuator/prometheus

场景 为保障java服务正常运行,对服务的jvm进行监控,通过使用actuator组件监控jvm情况,使用prometheus对数据进行采集,并在Grafana展现。 基于k8s场景 prometheus数据收集 配置service的lable,便于prometheus使用labl…

Python Flask+Echarts+sklearn+MySQL(评论情感分析、用户推荐、BI报表)项目分享

Python FlaskEchartssklearnMySQL(评论情感分析、用户推荐、BI报表)项目分享 项目背景: 随着互联网的快速发展和智能手机的普及,人们越来越倾向于在网上查找餐厅、购物中心、酒店和旅游景点等商户的点评和评分信息,以便做出更好的消费决策。…

Android 广播发送流程分析

在上一篇文章中Android 广播阻塞、延迟问题分析方法讲了广播阻塞的分析方法,但是分析完这个问题,自己还是有一些疑问: 广播为啥会阻塞呢?发送给接收器就行了,为啥还要等着接收器处理完才处理下一个?由普通…

【不限于联想Y9000P电脑关盖再打开时黑屏的解决办法】

不限于联想Y9000P电脑关盖再打开时黑屏的解决办法 问题的前言问题的出现问题拟解决 问题的前言 事情发生在昨天,更新了Win11系统后: 最惹人注目的三处地方就是: 1.可以查看时间的秒数了; 2.右键展示的内容变窄了; 3.按…

205、仿真-51单片机直流数字电流表多档位切换Proteus仿真设计(程序+Proteus仿真+原理图+流程图+元器件清单+配套资料等)

毕设帮助、开题指导、技术解答(有偿)见文未 目录 一、硬件设计 二、设计功能 三、Proteus仿真图 四、原理图 五、程序源码 资料包括: 方案选择 单片机的选择 方案一:STM32系列单片机控制,该型号单片机为LQFP44封装,内部资源…

等保案例 1

用户简介 吉林省人力资源和社会保障厅(简称“吉林省人社厅”)响应《网络安全法》的建设要求,为了向吉林省人民提供更好、更快、更稳定的信息化服务,根据《网络安全法》和等级保护2.0相关标准,落实网络安全与信息化建设…

【1572. 矩阵对角线元素的和】

来源:力扣(LeetCode) 描述: 给你一个正方形矩阵 mat,请你返回矩阵对角线元素的和。 请你返回在矩阵主对角线上的元素和副对角线上且不在主对角线上元素的和。 示例 1: 输入:mat [[1,2,3]…

uniapp 官方扩展组件 uni-combox 实现:只能选择不能手写(输入中支持过滤显示下拉列表)

uniapp 官方扩展组件 uni-combox 实现:只能选择不能手写(输入中支持过滤显示下拉列表) uni-comboxuni-combox 原本支持:问题: 改造源码参考资料 uni-combox uni-combox 原本支持: 下拉选择。输入关键字&am…

24届近3年南京信息工程大学自动化考研院校分析

今天给大家带来的是南京信息工程大学控制考研分析 满满干货~还不快快点赞收藏 一、南京信息工程大学 学校简介 南京信息工程大学位于南京江北新区,是一所以大气科学为特色的全国重点大学,由江苏省人民政府、中华人民共和国教育部、中国气…

轻量级自动化测试框架WebZ

一、什么是WebZ WebZ是我用Python写的“关键字驱动”的自动化测试框架,基于WebDriver。 设计该框架的初衷是:用自动化测试让测试人员从一些简单却重复的测试中解放出来。之所以用“关键字驱动”模式是因为我觉得这样能让测试人员(测试执行人员…

7-2 成绩转换

分数 15 全屏浏览题目 切换布局 作者 沈睿 单位 浙江大学 本题要求编写程序将一个百分制成绩转换为五分制成绩。转换规则: 大于等于90分为A;小于90且大于等于80为B;小于80且大于等于70为C;小于70且大于等于60为D;小…

gitlab修改远程仓库地址

目录 背景: 解决: 1.删除本地仓库关联的远程地址,添加新的远程仓库地址 2.直接修改本地仓库关联的远程仓库地址 3.打开.git隐藏文件修改远程仓库地址 4.拉取代码报错(git host key verification failed) 背景: 公司搬家&#…

数据可视化工具的三大类报表制作流程分享

电脑(pc)、移动、大屏三大类型的BI数据可视化报表制作步骤基本相同,差别就在于尺寸调整和具体的报表布局。这对于采用点击、拖拉拽方式来制作报表的奥威BI数据可视化工具来说就显得特别简单。接下来,我们就一起看看不这三大类型的…

【第三阶段】kotlin语言中的先决条件函数

用于函数内部判断异常,节省开发 1.checkNotNull()如果传入为null则抛出异常 fun main() {var name:String?nullcheckNotNull(name) }执行结果 2.requireNotNull ()如果传入为null则抛出异常 fun main() {var name:String?nullrequireNot…

【图像分类】理论篇(4)图像增强opencv实现

随机旋转 随机旋转是一种图像增强技术,它通过将图像以随机角度进行旋转来增加数据的多样性,从而帮助改善模型的鲁棒性和泛化能力。这在训练深度学习模型时尤其有用,可以使模型更好地适应各种角度的输入。 原图像: 旋转后的图像&…

快手商品详情数据API 抓取快手商品价格、销量、库存、sku信息

快手商品详情数据API是用来获取快手商品详情页数据的接口,请求参数为商品ID,这是每个商品唯一性的标识。返回参数有商品标题、商品标题、商品简介、价格、掌柜昵称、库存、宝贝链接、宝贝图片、商品SKU等。 接口名称:item_get 公共参数 名…

[oneAPI] BERT

[oneAPI] BERT BERT训练过程Masked Language Model(MLM)Next Sentence Prediction(NSP)微调 总结基于oneAPI代码 比赛:https://marketing.csdn.net/p/f3e44fbfe46c465f4d9d6c23e38e0517 Intel DevCloud for oneAPI&…

【云原生】Kubernetes 概述

Kubernetes 概述 1.Kubernetes 简介 Kubernetes 是一个可移植的、可扩展的、用于管理容器化工作负载和服务的开源平台,它简化(促进)了声明式配置和自动化。它有一个庞大的、快速增长的生态系统。Kubernetes 的服务、支持和工具随处可见。 K…

电脑-C盘结构

一 缓存文件 winR 输入%temp% 就会进入到电脑缓存目录 这里面的东西都可以删除 主要目录在User/xxx/AppData\Local\Temp 二 临时文件 C盘右键,详细信息 三 桌面文件 文件类型 program data表示是游戏存档/系统/软件的配置文件 drivers文件表示驱动程序文件 s…

cs231n assignment3 q1Network Visualization

文章目录 嫌啰嗦直接看代码Q1 :Network Visualizationcompute_saliency_maps题面解析代码输出 make_fooling_image题面解析代码输出 class_visualization_update_step题面解析代码输出 结语 嫌啰嗦直接看代码 Q1 :Network Visualization compute_saliency_maps 题面 这部分的…