Code Sample: Create a Persistent Memory-Aware Queue Using the Persistent Memory Development Kit (PMDK)

ID 686901
Updated 3/12/2018
Version Latest
Public

Introduction

This article shows how to implement a persistent memory (PMEM)-aware queue using a linked list and the C++ bindings of the Persistent Memory Development Kit (PMDK) library libpmemobj.

A queue is a first-in, first-out (FIFO) data structure that supports push and pop operations. A push operation adds a new element to the tail of the queue. A pop operation removes the element at the head of the queue. Each of these operations requires multiple separate stores. For example, a push on a non-empty queue requires two stores: one to the next pointer of the last element and one to the tail pointer.

A PMEM-aware queue differs from a standard queue in that its data structures reside permanently in persistent memory. A program or machine crash while a queue entry is only partially written could therefore result in a memory leak or a corrupted data structure. To avoid this, queue operations must be made transactional. PMDK provides support for transactional and atomic operations specific to persistent memory.
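To make the failure mode concrete, here is a minimal volatile sketch (plain pointers, no PMDK) of a push that is interrupted between its two stores. The names node, queue, push_interrupted, and consistent are hypothetical and exist only for this illustration.

```cpp
#include <cassert>
#include <cstdint>

struct node {
	node *next;
	uint64_t value;
};

struct queue {
	node *head;
	node *tail;
};

// Performs only the first of push's two stores, as if the program crashed
// before the tail pointer could be updated. Requires a non-empty queue.
inline void push_interrupted(queue &q, node *n) {
	assert(q.tail != nullptr);
	n->next = nullptr;
	q.tail->next = n; // store 1: link the new node after the last element
	// simulated crash: store 2 (q.tail = n) never happens
}

// Queue invariant: head and tail are both null, or tail is the last node.
inline bool consistent(const queue &q) {
	if (q.head == nullptr || q.tail == nullptr)
		return q.head == q.tail;
	return q.tail->next == nullptr;
}
```

After the simulated crash, the tail no longer points at the last node in the list. In persistent memory that inconsistent state would survive a restart, which is the kind of corruption a transaction is meant to prevent.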

We'll walk through a code sample that describes the core concepts and design considerations for creating a PMEM-aware queue using libpmemobj. You can build and run the code sample by following the instructions provided later in the article.

For background on persistent memory and the PMDK, read the article Introduction to Programming with Persistent Memory from Intel and watch the Persistent Memory Programming Video Series.

C++ Support in libpmemobj

The main features of the C++ bindings for libpmemobj include:

  • Transactions
  • Wrappers for basic types, which automatically snapshot the data during a transaction
  • Persistent pointers

Transactions

Transactions are at the core of libpmemobj operations. This is because, in terms of persistence, the current x86-64 CPUs guarantee atomicity only for 8-byte stores. Real-world apps may update in larger chunks. Take, for example, strings; it rarely makes sense to change only eight adjacent bytes from one consistent string state to another. To enable atomic updates to persistent memory in larger chunks, libpmemobj implements transactions.

Libpmemobj uses undo log-based transactions so that in the case of an interruption in the middle of a transaction, all of the changes made to the persistent state will be rolled back.
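The undo-log idea can be sketched in ordinary volatile C++. The following is an illustration only, not how libpmemobj is implemented: UndoLog and run_tx are hypothetical names, the log lives in DRAM, and an exception stands in for an interruption.

```cpp
#include <cassert>
#include <cstdint>
#include <functional>
#include <stdexcept>
#include <utility>
#include <vector>

// Hypothetical, volatile stand-in for an undo log.
class UndoLog {
public:
	// Record the current value at addr before the caller overwrites it.
	void snapshot(uint64_t *addr) {
		assert(addr != nullptr);
		entries_.emplace_back(addr, *addr);
	}

	// Restore the saved values, newest first, then clear the log.
	void rollback() {
		for (auto it = entries_.rbegin(); it != entries_.rend(); ++it)
			*it->first = it->second;
		entries_.clear();
	}

	// The changes are final, so the saved values are no longer needed.
	void commit() { entries_.clear(); }

private:
	std::vector<std::pair<uint64_t *, uint64_t>> entries_;
};

// Run body as one unit: commit if it finishes, roll back if it throws.
// Returns true when the changes were committed.
inline bool run_tx(UndoLog &log, const std::function<void()> &body) {
	try {
		body();
		log.commit();
		return true;
	} catch (const std::exception &) {
		log.rollback();
		return false;
	}
}
```

If the body throws after changing several values, every value that was snapshotted returns to its state from before the transaction started.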

Transactions are managed on a per-thread basis, so a query of the transaction status returns the status of the last transaction performed by the calling thread. Transactions are power-safe but not thread-safe. For more information, see C++ bindings for libpmemobj (part 6) - transactions at pmem.io.

The p<> template

In a transaction, undo logs are used to snapshot user data. The PMDK C library requires a manual snapshot to be performed before modifying data in a transaction. The C++ bindings do all of the snapshotting automatically, which reduces the likelihood of programmer error. The pmem::obj::p template wrapper class is the basic building block for this mechanism, and is designed to work with basic types only. Its implementation is based on operator=(): each time the assignment operator is called, the value wrapped by p is about to change, so the library snapshots the old value first. Using p<> for stack variables is discouraged because snapshotting is a computationally intensive operation.
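As a rough illustration of snapshot-on-assignment, the sketch below wraps a uint64_t and records the old value every time it is assigned. It is a simplified volatile model, not the pmem::obj::p implementation; SnapshotLog and tracked_u64 are hypothetical names.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// Hypothetical stand-in for the transaction's undo log.
struct SnapshotLog {
	struct Entry {
		uint64_t *addr;
		uint64_t old_value;
	};
	std::vector<Entry> entries;

	// Restore the saved values, newest first.
	void rollback() {
		for (auto it = entries.rbegin(); it != entries.rend(); ++it)
			*it->addr = it->old_value;
		entries.clear();
	}
};

// Simplified wrapper: assignment saves the old value before storing the new one.
class tracked_u64 {
public:
	tracked_u64(SnapshotLog &log, uint64_t v) : log_(&log), value_(v) {}
	tracked_u64(const tracked_u64 &) = delete;
	tracked_u64 &operator=(const tracked_u64 &) = delete;

	tracked_u64 &operator=(uint64_t v) {
		log_->entries.push_back({&value_, value_}); // automatic snapshot
		value_ = v;
		return *this;
	}

	// Reads need no snapshot.
	operator uint64_t() const { return value_; }

private:
	SnapshotLog *log_;
	uint64_t value_;
};
```

The caller writes x = 8 as with a plain integer, and the bookkeeping happens inside the assignment operator. The cost of that bookkeeping on every assignment is why such a wrapper is a poor fit for short-lived stack variables.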

Persistent pointers

Libraries in PMDK are built on the concept of memory-mapped files. Since a file can be mapped at different addresses of the process virtual address space, traditional pointers that store absolute addresses cannot be used. Instead, PMDK introduces a new pointer type that has two fields: the ID of the pool (used to look up the pool's current virtual address in a translation table) and an offset from the beginning of the pool. The persistent pointer is a C++ wrapper around this basic C type. Its philosophy is similar to that of std::shared_ptr.
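The two-field pointer can be modeled in a few lines. This is a sketch under stated assumptions: PoolTable and fat_ptr are hypothetical names, the translation table is a plain hash map, and an offset of 0 is treated as null for the purposes of the example.

```cpp
#include <cassert>
#include <cstdint>
#include <unordered_map>

// Hypothetical translation table: pool ID -> address where the pool is
// currently mapped in this process.
using PoolTable = std::unordered_map<uint64_t, char *>;

// Simplified position-independent pointer made of a pool ID and an offset.
template <typename T>
struct fat_ptr {
	uint64_t pool_id;
	uint64_t offset; // assumption for this sketch: 0 means null

	// Translate to a regular pointer that is valid for the current mapping.
	T *resolve(const PoolTable &table) const {
		if (offset == 0)
			return nullptr;
		auto it = table.find(pool_id);
		if (it == table.end())
			return nullptr; // the pool is not open
		return reinterpret_cast<T *>(it->second + offset);
	}
};
```

Because only the pool ID and offset are stored, the same fat_ptr value resolves correctly even after the pool is mapped at a different base address; only the table entry changes.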

libpmemobj Core Concepts

Root object

Making any code PMEM-aware using libpmemobj always involves, as a first step, designing the types of data objects that will be persisted. The first type that needs to be defined is that of the root object. This object is mandatory and used to anchor all the other objects created in the persistent memory pool (think of a pool as a file inside a PMEM device).

Pool

A pool is a contiguous region of PMEM identified by a user-supplied identifier called layout. Multiple pools can be created with different layout strings.

Queue Implementation using C++ Bindings

The queue in this example is implemented as a singly linked list with a head and a tail, and it demonstrates how to use the C++ bindings of libpmemobj.

Design Decisions

Data structures

The first thing we need is a data structure that describes a node in the queue. Each entry has a value and a link to the next node. As per the figure below, both variables are persistent memory-aware.

Figure 1. Data structure describing the queue implementation.
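In code, the layout described above might look like the following sketch. To keep it compilable without PMDK, it uses stand-in aliases in a hypothetical standin namespace; with PMDK installed, the real templates are pmem::obj::p and pmem::obj::persistent_ptr. The field names follow the walkthrough in this article, but the exact declarations in the sample source may differ.

```cpp
#include <cassert>
#include <cstdint>

// Stand-ins so the sketch compiles without PMDK. They only mimic the shape
// of the real templates and provide no persistence or snapshotting.
namespace standin {
template <typename T> using p = T;
template <typename T> using persistent_ptr = T *;
}

namespace sketch {
using standin::p;
using standin::persistent_ptr;

// One node of the singly linked list: a value and a link to the next node.
struct pmem_entry {
	persistent_ptr<pmem_entry> next = nullptr;
	p<uint64_t> value = 0;
};

// Root object: anchors the whole list through its head and tail.
struct pmem_queue {
	persistent_ptr<pmem_entry> head = nullptr;
	persistent_ptr<pmem_entry> tail = nullptr;

	bool empty() const { return head == nullptr; }
};
}
```

Every field is either wrapped in p<> or is a persistent pointer, so any change made to it inside a transaction can be tracked.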

Code walkthrough

Now, let's go a little deeper into the main function of the program. When running the code, you provide up to three arguments. The first is the absolute location of the pool file, the second is the queue operation to perform, and the third, required only for push, is the value to insert. The supported queue operations are push (insert an element), pop (remove and return the element at the head), and show (display the elements).

if (argc < 3) {
	std::cerr << "usage: " << argv[0]
	<< " file-name [push [value]|pop|show]" << std::endl;
	return 1;
}

In the snippet below, we check to see if the pool file exists. If it does, the pool is opened. If it doesn't exist, the pool is created. The layout string identifies the pool that we requested to open. Here we are opening the pool with layout name Queue as defined by the macro LAYOUT in the program.

const char *path = argv[1];
queue_op op = parse_queue_op(argv[2]);
pool<examples::pmem_queue> pop;

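// file_exists() returns a non-zero value when the file is missing,
// so a new pool is created in that case
if (file_exists(path) != 0) {
	pop = pool<examples::pmem_queue>::create(
		path, LAYOUT, PMEMOBJ_MIN_POOL, CREATE_MODE_RW);
} else {
	// the pool file already exists, so open it
	pop = pool<examples::pmem_queue>::open(path, LAYOUT);
}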
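The snippet relies on a file_exists() helper that is not listed in this article. A minimal sketch of such a helper, assuming a POSIX system and the convention implied by the code above (0 when the file exists, non-zero otherwise), could be:

```cpp
#include <cassert>
#include <unistd.h>

// Assumed helper, not copied from the sample: mirrors the return convention
// of POSIX access(), which is 0 on success and -1 on failure.
static inline int file_exists(const char *path) {
	assert(path != nullptr);
	return access(path, F_OK);
}
```

With this convention, the comparison file_exists(path) != 0 reads as "the file is missing", which is why the create branch comes first.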

pop is the handle to the pool, through which we can access a pointer to the root object, an instance of examples::pmem_queue. The create function creates a new pmemobj pool whose root object is of type examples::pmem_queue. The root object is like the root of a file system, since it can be used to reach all of the other objects in the pool (as long as these objects are linked properly and no pointers are lost due to coding errors).

auto q = pop.get_root();

Once you get the pointer to the queue object, the program checks the second argument in order to identify what type of action the queue should perform; that is, push, pop, or show.

switch (op) {
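	case QUEUE_PUSH:
		// the earlier check only guarantees argv[2], so verify argv[3]
		if (argc < 4)
			throw std::invalid_argument("push requires a value");
		q->push(pop, atoll(argv[3]));
		break;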
	case QUEUE_POP:
		std::cout << q->pop(pop) << std::endl;
		break;
	case QUEUE_SHOW:
		q->show();
		break;
	default:
		throw std::invalid_argument("invalid queue operation");
}
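The parse_queue_op() function used earlier is not listed in this article. One way it could be written is sketched below; the enumerator UNKNOWN_QUEUE_OP is an assumption introduced here to represent unrecognized input, and the sample's own implementation may differ.

```cpp
#include <cassert>
#include <cstring>

// QUEUE_PUSH, QUEUE_POP, and QUEUE_SHOW appear in the switch statement;
// UNKNOWN_QUEUE_OP is a hypothetical addition for unrecognized input.
enum queue_op {
	UNKNOWN_QUEUE_OP,
	QUEUE_PUSH,
	QUEUE_POP,
	QUEUE_SHOW
};

// Map the command-line word to a queue operation.
inline queue_op parse_queue_op(const char *str) {
	assert(str != nullptr);
	if (std::strcmp(str, "push") == 0)
		return QUEUE_PUSH;
	if (std::strcmp(str, "pop") == 0)
		return QUEUE_POP;
	if (std::strcmp(str, "show") == 0)
		return QUEUE_SHOW;
	return UNKNOWN_QUEUE_OP;
}
```

An unrecognized word then falls through to the default branch of the switch, which reports an invalid queue operation.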

Queue operations

Push

Let's look at how the push function is implemented to make it persistent memory-aware. As shown in the code below, the transactional code is implemented as a lambda function passed to transaction::exec_tx() (this makes the code easy to read and follow). If a power failure happens in the middle of the transaction, the data structure does not get corrupted because all changes are rolled back. For more information on how transactions are implemented in C++, read C++ bindings for libpmemobj (part 6) - transactions on pmem.io.

Allocation functions are transactional as well, and they use transaction logic to enable rollback of allocations and deallocations of persistent state. make_persistent() allocates and constructs an object, while delete_persistent() destroys and frees it.

Calling make_persistent() inside a transaction allocates an object and returns a persistent object pointer. Because the allocation is part of the transaction, if the transaction aborts, the allocation is rolled back and the memory returns to its original state.

After the allocation, the value field of the new node n is set to the value being pushed, and its next pointer is set to null.

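void push(pool_base &pop, uint64_t value) {
	transaction::exec_tx(pop, [&] {
		// allocate the new node as part of the transaction
		auto n = make_persistent<pmem_entry>();

		n->value = value;
		n->next = nullptr;

		if (head == nullptr && tail == nullptr) {
			// empty queue: the new node is both head and tail
			head = tail = n;
		} else {
			// link the new node after the current tail
			tail->next = n;
			tail = n;
		}
	});
}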

Figure 2. Data structure for push functionality.

Pop

The pop function, shown below, is similar to push. Here we need a temporary variable to store a pointer to the next pmem_entry in the queue, so that the head of the queue can be set to that entry after the current head is deleted with delete_persistent(). Because all of this is done inside a transaction, the queue remains consistent even if the operation is interrupted.

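uint64_t pop(pool_base &pop) {
	uint64_t ret = 0;
	transaction::exec_tx(pop, [&] {
		// nothing to pop from an empty queue: abort the transaction
		if (head == nullptr)
			transaction::abort(EINVAL);

		ret = head->value;
		// remember the next entry before the head is freed
		auto n = head->next;

		delete_persistent<pmem_entry>(head);
		head = n;

		// the queue became empty, so reset the tail as well
		if (head == nullptr)
			tail = nullptr;
	});

	return ret;
}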

Figure 3. Data structure for pop functionality.
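The show operation is not listed in this article. Since it only reads the list, it can be pictured as a plain traversal from head to tail. The sketch below uses ordinary pointers and a hypothetical entry type rather than the PMDK types, and writes to any output stream so that it is easy to try out.

```cpp
#include <cassert>
#include <cstdint>
#include <ostream>
#include <sstream>

// Hypothetical volatile node with the same shape as a queue entry.
struct entry {
	entry *next;
	uint64_t value;
};

// Print every value from head to tail, one per line. The traversal only
// reads the list, so nothing is modified or snapshotted.
inline void show(const entry *head, std::ostream &out) {
	for (const entry *n = head; n != nullptr; n = n->next)
		out << n->value << '\n';
}
```

The values come out in insertion order, which matches the FIFO behavior of the queue: the first value printed is the one the next pop would return.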

Build Instructions

Instructions to run the code sample

Download the source code from the PMDK GitHub* repository:

  1. git clone https://github.com/pmem/pmdk.git

    Figure 4. Download source code from the GitHub* repository.

  2. cd pmdk and run make on the command line as shown below. This builds the complete source code tree.

    Figure 5. Building the source code.

  3. From the pmdk directory, cd src/examples/libpmemobj++/queue
  4. View command line options for the queue program:
    ./queue
  5. Push command:
    ./queue TESTFILE push 8

    Figure 6. PUSH command using command line.

  6. Pop command:
    ./queue TESTFILE pop
  7. Show command:
    ./queue TESTFILE show

    Figure 7. POP command using command line.

Summary

In this article, we showed a simple implementation of a PMEM-aware queue using the C++ bindings of the PMDK library libpmemobj. To learn more about persistent memory programming with PMDK, visit the Intel® Developer Zone (Intel® DZ) Persistent Memory Programming site. There you will find articles, videos, and links to other important resources for PMEM developers.

About the Author

Praveen Kundurthy is a Developer Evangelist with over 14 years of experience in application development, optimization and porting to Intel platforms. Over the past few years at Intel, he has worked on topics spanning Storage technologies, Gaming, Virtual reality and Android on Intel platforms.
