Quick Start Guide Part 2: Linux Provisioning for Intel® Optane™ Persistent Memory

ID 733787
Updated 6/16/2020
Version Latest
Public

Introduction

Intel® Optane™ persistent memory (PMem) is a disruptive technology that creates a new tier between memory and storage. Intel Optane™ PMem modules support two modes: Memory Mode for volatile use cases and App Direct Mode that provides byte-addressable persistent storage. 

In this part of the quick start guide, we describe how to provision PMem using the ipmctl and ndctl utilities in a Linux environment.

  • The ipmctl utility is necessary for discovering Intel Optane persistent memory module resources, creating goals and regions, updating the firmware, and debugging issues.
  • The ndctl utility is used to configure the namespaces.

This quick start guide (QSG) has three sections:

  1. Quick Start Guide Part 1: Persistent Memory Provisioning Introduction
  2. Quick Start Guide Part 2: Provisioning for Intel® Optane™ Persistent Memory on Linux
  3. Quick Start Guide Part 3: Provisioning for Intel® Optane™ Persistent Memory on Microsoft Windows

Linux Support for Intel Optane Persistent Memory

Most Linux distributions support persistent memory. The list of Compatible Operating Systems for Intel® Optane™ Persistent Memory details whether Memory Mode, App Direct Mode, or mixed mode is supported.

Kernel Support

The Linux NVDIMM/PMem drivers are enabled by default, starting with Linux mainline kernel 4.2. We recommend mainline kernel version 4.19 or later to deliver Reliability, Availability and Serviceability (RAS) features required by the Persistent Memory Development Kit (PMDK).  

Custom Kernel Support

Whether you use the default kernel provided by your distro, compile your own, or install a kernel from another source, it is worth confirming that your kernel supports PMem. The following example was captured on a v5.8 kernel and shows how to verify which features are available and whether they are enabled or disabled in that kernel. The results will vary depending on the kernel version.

# egrep -i "zone_device|hugepage|nfit|nvdimm|pmem|_nd_|btt|dax|memory_hotplug" /boot/config-$(uname -r)
CONFIG_X86_PMEM_LEGACY_DEVICE=y
CONFIG_X86_PMEM_LEGACY=m
CONFIG_ARCH_ENABLE_MEMORY_HOTPLUG=y
CONFIG_ARCH_ENABLE_HUGEPAGE_MIGRATION=y
CONFIG_ACPI_NFIT=m
# CONFIG_NFIT_SECURITY_DEBUG is not set
CONFIG_HAVE_ARCH_TRANSPARENT_HUGEPAGE=y
CONFIG_HAVE_ARCH_TRANSPARENT_HUGEPAGE_PUD=y
CONFIG_MEMORY_HOTPLUG=y
CONFIG_MEMORY_HOTPLUG_SPARSE=y
CONFIG_MEMORY_HOTPLUG_DEFAULT_ONLINE=y
CONFIG_TRANSPARENT_HUGEPAGE=y
# CONFIG_TRANSPARENT_HUGEPAGE_ALWAYS is not set
CONFIG_TRANSPARENT_HUGEPAGE_MADVISE=y
CONFIG_ZONE_DEVICE=y
# CONFIG_VIRTIO_PMEM is not set
# CONFIG_XEN_BALLOON_MEMORY_HOTPLUG is not set
CONFIG_LIBNVDIMM=m
CONFIG_BLK_DEV_PMEM=m
CONFIG_ND_BLK=m
CONFIG_ND_CLAIM=y
CONFIG_ND_BTT=m
CONFIG_BTT=y
CONFIG_ND_PFN=m
CONFIG_NVDIMM_PFN=y
CONFIG_NVDIMM_DAX=y
CONFIG_NVDIMM_KEYS=y
CONFIG_DAX_DRIVER=y
CONFIG_DAX=y
CONFIG_DEV_DAX=m
CONFIG_DEV_DAX_PMEM=m
CONFIG_DEV_DAX_HMEM=m
CONFIG_DEV_DAX_KMEM=m
# CONFIG_DEV_DAX_PMEM_COMPAT is not set
CONFIG_FS_DAX=y
CONFIG_FS_DAX_PMD=y
CONFIG_ARCH_HAS_PMEM_API=y
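
The same check can be scripted. The following is a minimal sketch, not an official tool: the function name check_pmem_config and the short list of options it looks for are assumptions chosen for illustration from the listing above, so adjust the list to the features you need.

```shell
#!/bin/sh
# Sketch: report which PMem-related kernel options are enabled in a config file.
# The function name and the option list are illustrative assumptions.

check_pmem_config() {
    config_file=$1
    missing=0
    for opt in CONFIG_LIBNVDIMM CONFIG_BLK_DEV_PMEM CONFIG_ACPI_NFIT \
               CONFIG_ZONE_DEVICE CONFIG_FS_DAX CONFIG_DEV_DAX; do
        # An option counts as enabled when it is built in (=y) or a module (=m).
        if grep -Eq "^${opt}=(y|m)$" "$config_file"; then
            echo "ok      $opt"
        else
            echo "missing $opt"
            missing=1
        fi
    done
    return "$missing"
}

# Typical use on the running kernel:
#   check_pmem_config "/boot/config-$(uname -r)"
```

The function returns a non-zero status when any option in the list is absent, which makes it usable as a guard in a provisioning script.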

Software: Non-volatile Device Control (ndctl)

The vendor-neutral ndctl command is used to manage namespaces. The utility is designed to work with NVDIMMs from different vendors, including Intel Optane PMem.
ndctl supports the following functionality:

  • Show PMem module information
  • Manage namespaces and configuration labels
  • Monitor health
  • Manage security - passphrases and secure erase
  • Error injection and testing

For more information, refer to the ndctl User Guide.

Installing ndctl

The ndctl utility is available in most Linux package repositories, or you can download and compile the source code, which is available on the ndctl GitHub project page.
For example, on Ubuntu Linux, run:

apt-get install ndctl libndctl6 libndctl-dev

Further information can be found in the Installing NDCTL and DAXCTL documentation.

Getting Help

The full list of commands is available by running `ndctl --list-cmds`. Each of the commands is documented in the man pages and accessible through the `ndctl help` interface.
For a comprehensive review of this utility and to get further assistance, access the following resources:

IPMCTL

ipmctl is an open source utility created and maintained by Intel to manage Intel® Optane™ persistent memory modules. The ipmctl tool is available for both Linux and Windows. Refer to the ipmctl GitHub project for more information. 

Installing ipmctl

Detailed step-by-step instructions can be found in the Installing IPMCTL documentation. On most Linux distributions, you can use the package manager to install the ipmctl package and its dependencies, for example:

$ sudo [apt|dnf|yum|zypper] install ipmctl

Getting Help

A full list of commands is available by running 'man ipmctl' or 'ipmctl help' from the command line.
For a comprehensive review of this utility and to get further assistance, access the following resources:

Tutorial: Provisioning Persistent Memory

This tutorial provides instructions for configuring and managing Intel® Optane™ persistent memory (PMem) modules in Memory Mode, App Direct Mode, or mixed mode on a Linux host. After using ipmctl to discover and provision the modules into either Memory or App Direct mode, we will use the Non-volatile Device Control (ndctl) utility to finish the configuration. All the commands described in this section are demonstrated on a two-socket system with 6 Terabytes (TB) of PMem, and 384 gigabytes (GB) of DDR4 DRAM.

Note All ndctl and ipmctl snippets and screenshots below are from a server running Ubuntu 20.04, and the ipmctl snippets are from ipmctl v2.x.

Prerequisites

To provision Intel PMem, you need the following:

  • Intel Optane PMem Modules are installed
  • The ipmctl utility is used to configure and manage PMem modules. 
    • To determine if ipmctl is already installed, type 'which ipmctl' into the terminal window.
    • If ipmctl is not installed, refer to the Installing ipmctl user guide.
  • The ndctl utility is used to manage the regions and namespaces.
    • To determine if ndctl is already installed, type 'which ndctl' into the terminal window.
    • If ndctl is not installed, refer to the Installing ndctl user guide.
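
Both checks can be combined into one step. This is a small sketch that uses the POSIX `command -v` lookup in place of `which`; the helper name require_tools is hypothetical.

```shell
#!/bin/sh
# Sketch: check that the required utilities are on the PATH.
# require_tools is a hypothetical helper name.

require_tools() {
    status=0
    for tool in "$@"; do
        if command -v "$tool" >/dev/null 2>&1; then
            echo "$tool: found at $(command -v "$tool")"
        else
            echo "$tool: not installed"
            status=1
        fi
    done
    return "$status"
}

# Typical use:
#   require_tools ipmctl ndctl
```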

Step 1: PMem Discovery

This tutorial assumes that an operating mode was pre-configured by the system provider. New systems are commonly pre-configured in App Direct Mode or Memory Mode, based on the customer's requirements.

  1. The ipmctl show -dimm command displays the PMem modules discovered in the system and verifies that software can communicate with them. Among other information, this command outputs each DIMM ID, capacity, health state, and firmware version. The PMem module firmware and BIOS releases are available from the system provider.
    # sudo ipmctl show -dimm
    DimmID | Capacity    | LockState | HealthState | FWVersion
    ===============================================================
     0x0001 | 126.422 GiB | Disabled  | Healthy     | 01.02.00.5435
     0x0011 | 126.422 GiB | Disabled  | Healthy     | 01.02.00.5435
     0x0021 | 126.422 GiB | Disabled  | Healthy     | 01.02.00.5435
    ….
     0x1121 | 126.422 GiB | Disabled  | Healthy     | 01.02.00.5435
  2. The ipmctl show -topology command identifies which memory slots the DDR and PMem modules are installed in.
    $ sudo ipmctl show -topology
     DimmID | MemoryType                  | Capacity    | PhysicalID| DeviceLocator
    ================================================================================
     0x0001 | Logical Non-Volatile Device | 126.422 GiB | 0x0026    | CPU1_DIMM_A2
     0x0011 | Logical Non-Volatile Device | 126.422 GiB | 0x0028    | CPU1_DIMM_B2
     0x0021 | Logical Non-Volatile Device | 126.422 GiB | 0x002a    | CPU1_DIMM_C2
     0x0101 | Logical Non-Volatile Device | 126.422 GiB | 0x002c    | CPU1_DIMM_D2
     0x0111 | Logical Non-Volatile Device | 126.422 GiB | 0x002e    | CPU1_DIMM_E2
     0x0121 | Logical Non-Volatile Device | 126.422 GiB | 0x0030    | CPU1_DIMM_F2
     0x1001 | Logical Non-Volatile Device | 126.422 GiB | 0x0032    | CPU2_DIMM_A2
     0x1011 | Logical Non-Volatile Device | 126.422 GiB | 0x0034    | CPU2_DIMM_B2
     0x1021 | Logical Non-Volatile Device | 126.422 GiB | 0x0036    | CPU2_DIMM_C2
     0x1101 | Logical Non-Volatile Device | 126.422 GiB | 0x0038    | CPU2_DIMM_D2
     0x1111 | Logical Non-Volatile Device | 126.422 GiB | 0x003a    | CPU2_DIMM_E2
     0x1121 | Logical Non-Volatile Device | 126.422 GiB | 0x003c    | CPU2_DIMM_F2
     N/A    | DDR4                        | 32.000 GiB  | 0x0025    | CPU1_DIMM_A1
     N/A    | DDR4                        | 32.000 GiB  | 0x0027    | CPU1_DIMM_B1
     N/A    | DDR4                        | 32.000 GiB  | 0x0029    | CPU1_DIMM_C1
     N/A    | DDR4                        | 32.000 GiB  | 0x002b    | CPU1_DIMM_D1
     N/A    | DDR4                        | 32.000 GiB  | 0x002d    | CPU1_DIMM_E1
     N/A    | DDR4                        | 32.000 GiB  | 0x002f    | CPU1_DIMM_F1
     N/A    | DDR4                        | 32.000 GiB  | 0x0031    | CPU2_DIMM_A1
     N/A    | DDR4                        | 32.000 GiB  | 0x0033    | CPU2_DIMM_B1
     N/A    | DDR4                        | 32.000 GiB  | 0x0035    | CPU2_DIMM_C1
     N/A    | DDR4                        | 32.000 GiB  | 0x0037    | CPU2_DIMM_D1
     N/A    | DDR4                        | 32.000 GiB  | 0x0039    | CPU2_DIMM_E1
     N/A    | DDR4                        | 32.000 GiB  | 0x003b    | CPU2_DIMM_F1
    
  3. ipmctl show -memoryresources reports the current configuration, if one exists. If no configuration has been created, ipmctl returns a message similar to "One or more PMem modules have invalid PCD data. A platform reboot is recommended to restore valid PCD data, then try again." In the output below, the MemoryCapacity and AppDirectCapacity values (the Volatile and AppDirect rows) show whether the system is configured in Memory Mode, App Direct Mode, or mixed mode. Here are a few examples:
    • If the system is in Memory Mode, you will see something like this (see the Volatile row):
      # sudo ipmctl show -memoryresources
       MemoryType   | DDR         | PMemModule   | Total
      ==========================================================
       Volatile     | 189.500 GiB | 1512.000 GiB | 1512.000 GiB
       AppDirect    | -           | -            | -
       Cache        | 0.000 GiB   | -            | 0.000 GiB
       Inaccessible | 2.500 GiB   | 5.066 GiB    | 7.566 GiB
       Physical     | 192.000 GiB | 1517.066 GiB | 1709.066 GiB

    • If the system is in App Direct Mode, you will see something like this (see the AppDirect row):
       MemoryType   | DDR         | PMemModule   | Total
      ==========================================================
       Volatile     | 189.500 GiB | 0.000 GiB    | 189.500 GiB
       AppDirect    | -           | 1512.000 GiB | 1512.000 GiB
       Cache        | 0.000 GiB   | -            | 0.000 GiB
       Inaccessible | 2.500 GiB   | 5.066 GiB    | 7.566 GiB
       Physical     | 192.000 GiB | 1517.066 GiB | 1709.066 GiB

    • If the system is in mixed mode (50/50), you will see something like this (see the Volatile and AppDirect rows):
       MemoryType   | DDR         | PMemModule   | Total
      ==========================================================
       Volatile     | 189.500 GiB | 758.000 GiB  | 758.000 GiB
       AppDirect    | -           | 758.000 GiB  | 758.000 GiB
       Cache        | 0.000 GiB   | -            | 0.000 GiB
       Inaccessible | 2.500 GiB   | 5.066 GiB    | 7.566 GiB
       Physical     | 192.000 GiB | 1517.066 GiB | 1709.066 GiB
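
The decision described in step 3 can be automated by reading the PMemModule column of the Volatile and AppDirect rows. The sketch below assumes the ipmctl v2.x table layout shown in the examples above (columns separated by "|"); detect_pmem_mode is a hypothetical helper name, and other ipmctl versions may format the table differently.

```shell
#!/bin/sh
# Sketch: classify the provisioning mode from `ipmctl show -memoryresources` output.
# detect_pmem_mode is a hypothetical helper. It assumes the table layout shown
# above: columns separated by "|", with PMemModule in the third column.

detect_pmem_mode() {
    awk -F'|' '
        # Strip whitespace from the row label, then read the PMemModule column.
        { label = $1; gsub(/[ \t]/, "", label) }
        label == "Volatile"  { volatile  = $3 + 0 }
        label == "AppDirect" { appdirect = $3 + 0 }
        END {
            if (volatile > 0 && appdirect > 0) print "mixed"
            else if (volatile > 0)              print "memory"
            else if (appdirect > 0)             print "appdirect"
            else                                print "unallocated"
        }'
}

# Typical use:
#   sudo ipmctl show -memoryresources | detect_pmem_mode
```

A "-" in the PMemModule column converts to zero, so a row without capacity is treated the same as a row showing 0.000 GiB.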
      

Step 2: Determine a Configuration Goal 

At this point, if your system is already configured as you need it, treat the following step as informational. If you need to change the configuration, this section explains the basics. If you need to change the platform configuration, for example from Memory Mode to App Direct Mode, see the appendix on resetting the system.

When setting a goal, we set the Intel Optane PMem modules to be used in one of the three standard modes: Memory Mode, App Direct Mode, or mixed mode. Once the mode is set, the system needs to be rebooted for the changes to take effect.

The first thing you need to do is decide which of the three modes you require and run the appropriate configuration command.

  • Memory Mode - 100% of the PMem capacity across sockets can be provisioned in Memory Mode. You can always use the -f option to overwrite any existing goal. The system must comply with the recommended DRAM to PMem capacity ratios: at least 1:4 and no more than 1:16 per CPU socket. For example, a system with 192 GB of DRAM should have at least 768 GB of PMem. ipmctl and the BIOS may report a warning at boot time if the configuration does not meet the requirements.
     
  • App Direct Mode - PMem can be provisioned in App Direct Mode with interleaving enabled or disabled. As described in QSG Part 1, interleaving increases the throughput of reads and writes to PMem.
     
  • Mixed Modes - A percentage of PMem capacity can be provisioned in Memory mode, as described below. Any remaining PMem capacity can either be reserved or used to create an AppDirect region. 
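
The Memory Mode ratio rule above is simple arithmetic and can be checked before creating a goal. The following sketch uses hypothetical names (check_ratio) and takes whole-number capacities in the same unit for one CPU socket.

```shell
#!/bin/sh
# Sketch: check the DRAM:PMem capacity ratio recommended for Memory Mode
# (between 1:4 and 1:16). check_ratio is a hypothetical helper; both sizes
# are whole numbers in the same unit, for one CPU socket.

check_ratio() {
    dram=$1
    pmem=$2
    if [ "$pmem" -lt $((dram * 4)) ]; then
        echo "below 1:4"
    elif [ "$pmem" -gt $((dram * 16)) ]; then
        echo "above 1:16"
    else
        echo "ok"
    fi
}

check_ratio 192 768    # prints "ok" (exactly 1:4)
check_ratio 192 512    # prints "below 1:4"
```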

Step 3: Configure your Platform

At this point, we have three options. Choose the sub-tutorial below that matches the mode you want to configure.

Memory Mode Example

In this example, 100% of the available PMem capacity is to be provisioned in Memory Mode. 

  1. Enter the following on the command line ipmctl create -goal memorymode=100
    # ipmctl create -goal memorymode=100
    
    The following configuration will be applied:
     SocketID | DimmID | MemorySize | AppDirect1Size | AppDirect2Size
    ==================================================================
     0x0000   | 0x0001 | 502.0 GiB | 0.0 GiB        | 0.0 GiB
     0x0000   | 0x0011 | 502.0 GiB | 0.0 GiB        | 0.0 GiB
     0x0000   | 0x0021 | 502.0 GiB | 0.0 GiB        | 0.0 GiB
    ...
     0x0001   | 0x1111 | 502.0 GiB | 0.0 GiB        | 0.0 GiB
     0x0001   | 0x1121 | 502.0 GiB | 0.0 GiB        | 0.0 GiB
  2. A reboot is required to process new memory allocation goals.

App Direct Example

In this example, we configure the system into App Direct Mode.

  1. Enter the following on the command line ipmctl create -goal persistentmemorytype=appdirect
    # ipmctl create -goal persistentmemorytype=appdirect
    
    The following configuration will be applied:
     SocketID | DimmID | MemorySize | AppDirect1Size | AppDirect2Size
    ==================================================================
     0x0000   | 0x0011 | 0.0 GiB    | 502.0 GiB      | 0.0 GiB
     0x0000   | 0x0021 | 0.0 GiB    | 502.0 GiB      | 0.0 GiB
     0x0000   | 0x0001 | 0.0 GiB    | 502.0 GiB      | 0.0 GiB
    ...
     0x0001   | 0x1121 | 0.0 GiB    | 502.0 GiB      | 0.0 GiB
     0x0001   | 0x1101 | 0.0 GiB    | 502.0 GiB      | 0.0 GiB
    Do you want to continue? [y/n] y
    Created following region configuration goal
     SocketID | DimmID | MemorySize | AppDirect1Size | AppDirect2Size
    ==================================================================
     0x0000   | 0x0011 | 0.0 GiB    | 502.0 GiB      | 0.0 GiB
     0x0000   | 0x0021 | 0.0 GiB    | 502.0 GiB      | 0.0 GiB
     0x0000   | 0x0001 | 0.0 GiB    | 502.0 GiB      | 0.0 GiB
    ...
     0x0001   | 0x1121 | 0.0 GiB    | 502.0 GiB      | 0.0 GiB
     0x0001   | 0x1101 | 0.0 GiB    | 502.0 GiB      | 0.0 GiB
  2. A reboot is required to process new memory allocation goals.

Mixed Mode Example

In this example, 60% of the available PMem capacity is to be provisioned in Memory Mode.

  1. Enter the following on the command line ipmctl create -goal memorymode=60
    # ipmctl create -goal memorymode=60
     SocketID | DimmID | MemorySize | AppDirect1Size | AppDirect2Size
    ==================================================================
     0x0000   | 0x0011 | 310.0 GiB | 192.0 GiB      | 0.0 GiB
     0x0000   | 0x0021 | 310.0 GiB | 192.0 GiB      | 0.0 GiB
     0x0000   | 0x0001 | 310.0 GiB | 192.0 GiB      | 0.0 GiB
    ...
     0x0001   | 0x1121 | 310.0 GiB | 192.0 GiB      | 0.0 GiB
     0x0001   | 0x1101 | 310.0 GiB | 192.0 GiB      | 0.0 GiB
    Do you want to continue? [y/n] y
    Created following region configuration goal
     SocketID | DimmID | MemorySize | AppDirect1Size | AppDirect2Size
    ==================================================================
     0x0000   | 0x0011 | 310.0 GiB | 192.0 GiB      | 0.0 GiB
     0x0000   | 0x0021 | 310.0 GiB | 192.0 GiB      | 0.0 GiB
     0x0000   | 0x0001 | 310.0 GiB | 192.0 GiB      | 0.0 GiB
    ...
     0x0001   | 0x1121 | 310.0 GiB | 192.0 GiB      | 0.0 GiB
     0x0001   | 0x1101 | 310.0 GiB | 192.0 GiB      | 0.0 GiB
  2. A reboot is required to process new memory allocation goals.
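
The three examples above differ only in the arguments passed to ipmctl create -goal. As a compact summary, the sketch below prints the matching command for a chosen mode without running it; goal_command is a hypothetical helper, and the mode names it accepts are our own labels rather than ipmctl keywords.

```shell
#!/bin/sh
# Sketch: print the ipmctl goal command for a chosen mode without running it.
# goal_command and its mode labels (memory, appdirect, mixed) are hypothetical.

goal_command() {
    mode=$1
    percent=${2:-100}
    case $mode in
        memory)    echo "ipmctl create -goal memorymode=100" ;;
        appdirect) echo "ipmctl create -goal persistentmemorytype=appdirect" ;;
        mixed)     echo "ipmctl create -goal memorymode=$percent" ;;
        *)         echo "unknown mode: $mode" >&2; return 1 ;;
    esac
}

goal_command mixed 60    # prints "ipmctl create -goal memorymode=60"

# After creating a goal and before rebooting, the pending goal can be
# reviewed with: sudo ipmctl show -goal
```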

Step 4: Provisioning of Intel Optane Persistent Memory Namespaces 

Having provisioned your system in one of the three modes, we now need to create the namespace(s) for App Direct Mode and mixed mode. If your device was configured in Memory Mode (for example, with ipmctl create -goal memorymode=100), it is fully configured and ready for use.

Prerequisites

The system is in App Direct or mixed mode, as described in Quick Start Guide Part 1: Provisioning Basics for Intel® Optane™ Persistent Memory.

Tutorial: How to create a namespace

Now we need to create a namespace for the application(s) to run. The namespace size can be any percentage of the overall PMem memory capacity of the installed PMem modules. Please run the following commands to configure the system.

  1. Decide whether to configure the entire PMem capacity or only a portion.
    • For a single namespace that uses all the available capacity of a region, use the following command: sudo ndctl create-namespace
    • To designate only a portion of the persistent memory capacity, run a command such as sudo ndctl create-namespace -m fsdax -s 54G
      1. Let's first understand what the command does.
        1. Argument details
          • -m fsdax - Creates the new namespace in fsdax mode. Other supported modes include sector, devdax, and raw.
          • -s 54G - Requests 54 GiB for this namespace. The usable size is the requested size minus the size of the metadata. You can use any amount up to the full capacity.
          • Remember: in mixed mode, the full capacity is reduced by the percentage allocated to Memory Mode.
        2. Argument results
          • This command creates a new /dev/pmem{X[.Y]} device.
          • The X value represents the region of the namespace. The namespace defaults to zero (0).
          • When working with multiple namespaces within a region, the naming convention is pmem{X.Y}, where Y is a sequentially increasing integer value for each new namespace.
          • The first namespace is pmem0. The command prints the properties of the namespace it created:
            $ sudo ndctl create-namespace -m fsdax -s 54G
            {
              "dev":"namespace1.0",
              "mode":"fsdax",
              "map":"dev",
              "size":"53.15 GiB (57.07 GB)",
              "uuid":"3879f23c-c3c3-4835-8950-fca3169056fd",
              "sector_size":512,
              "align":2097152,
              "blockdev":"pmem1"
            } 

             

  2. Enter the following on the command line ls -l /dev/pmem* to verify if the namespace was created.
    ls -l /dev/pmem*
    brw-rw---- 1 root disk 259, 4 Sep 30 10:54 /dev/pmem1 
  3. Enter the following on the command line ndctl create-namespace -n "MyApp" for tagging purposes.
    • Tagging the namespace with a friendly name or description using the -n argument is particularly useful when provisioning space for multiple end users or applications, or when identifying namespaces for production or test/dev environments.
      {
       "dev":"namespace0.0",
       "mode":"fsdax",
       "map":"dev",
       "size":"123.04 GiB (132.12 GB)",
       "uuid":"6a0abb59-5279-4731-921a-0099101e17f2",
       "raw_uuid":"03b40e23-56e1-407a-b5d1-f1ec929645c1",
       "sector_size":512,
       "blockdev":"pmem0",
       "name":"MyApp",
       "numa_node":0
      }
      

Note Replace MyApp with a name of your choice.
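
The earlier -s 54G example reports a size of 53.15 GiB rather than 54 GiB because of the namespace metadata. One way to estimate the difference, assuming the metadata is stored on the device ("map":"dev", as in the output above) and that the kernel needs 64 bytes of page metadata for every 4 KiB page (1/64 of the capacity), is sketched below. The helper name estimate_usable_mib is hypothetical, and alignment padding can reduce the real figure slightly further.

```shell
#!/bin/sh
# Sketch: estimate usable fsdax capacity when metadata is stored on the device.
# Assumption: 64 bytes of metadata per 4 KiB page, i.e. 1/64 of the requested size.
# estimate_usable_mib is a hypothetical helper name.

estimate_usable_mib() {
    requested_mib=$1
    echo $((requested_mib - requested_mib / 64))
}

# 54 GiB requested = 55296 MiB
estimate_usable_mib $((54 * 1024))    # prints 54432, which is about 53.16 GiB
```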

  1. Enter the following on the command line sudo ndctl create-namespace, to designate the remaining pmem capacity.
    # ndctl create-namespace
    {
    "dev":"namespace0.1",
    "mode":"fsdax",
    "map":"dev",
    "size":"73.83 GiB (79.27 GB)",
    "uuid":"d7f9473e-97aa-48cf-aefa-128797c83e88",
    "raw_uuid":"6c116e57-19dd-43d8-ae03-039f1588a23a",
    "sector_size":512,
    "blockdev":"pmem0.1",
    "numa_node":0
    }

Note Only run this command after all the required regions are created, as shown above.

  1. Enter the following on the command line ndctl list -N, to see the list of all the namespaces created: 
    # ndctl list -N
     [
      {
        "dev":"namespace1.0",
        "mode":"devdax",
        "map":"dev",
        "size":799063146496,
        "uuid":"0d352296-b85b-4acf-859c-636369741672",
        "chardev":"dax1.0",
        "align":2097152
      },
      {
        "dev":"namespace0.0",
        "mode":"devdax",
        "map":"dev",
        "size":799063146496,
        "uuid":"2c912ac4-5f13-4dfa-b855-7d9b30bb5c7b",
        "chardev":"dax0.0",
        "align":2097152
      }
    ]
    
  2. Enter the following on the command line ls -l /dev/pmem*, to check out the Persistent Memory device created.
  3. Enter the following on the command line mkfs.ext4 /dev/pmem1 to create a file system on the PMem device.
  4. Enter the following on the command line mkdir /mnt/myPMem to create a mount point.
  5. Enter the following on the command line mount -o dax /dev/pmem1 /mnt/myPMem to mount the file system with the -o dax option.
  6. The system is now ready to be used by the application.
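
After mounting, it can be worth confirming that the dax option actually took effect. The sketch below reads a mounts table in the /proc/mounts format and looks for a dax option on the given mount point. The helper name is_dax_mounted is hypothetical, and the check assumes the kernel reports the option as either dax or dax=always, which may vary by kernel version.

```shell
#!/bin/sh
# Sketch: check whether a mount point is mounted with a dax option.
# is_dax_mounted is a hypothetical helper. It reads a mounts table in the
# /proc/mounts format: device, mount point, type, options, ...

is_dax_mounted() {
    mount_point=$1
    mounts_file=${2:-/proc/mounts}
    awk -v mp="$mount_point" '
        $2 == mp {
            n = split($4, opts, ",")
            for (i = 1; i <= n; i++)
                if (opts[i] == "dax" || opts[i] == "dax=always") found = 1
        }
        END { exit (found ? 0 : 1) }' "$mounts_file"
}

# Typical use after mounting:
#   is_dax_mounted /mnt/myPMem && echo "DAX is active"
```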

Resources:

This concludes the tutorial for provisioning PMem within Linux. See below for a link to other articles in this series.

Appendix: How to factory reset the persistent memory

Warning This information is supplemental only and is not necessary for the tutorial above. This is a permanent action. All data will be lost.

Quick Method

  1. sudo ndctl destroy-namespace -f all
  2. sudo ipmctl delete -f -dimm -pcd
    1. Deletes the Goal aka PCD (Platform Config Data)
  3. Reboot the system

Long Method

  1. Log in as root, or prefix the commands below with sudo
  2. ndctl list -N
    • Get a list of namespaces
  3. Unmount the file system from the namespace needing to be removed
  4. Disable the namespaces (repeat for each namespace to be removed)
    • ndctl disable-namespace <namespace>

Warning Disabling a namespace while it is mounted or in use results in undefined behavior by the application(s) using the namespace. Always stop any applications and unmount file systems in fsdax mode before disabling a namespace.

  5. ndctl destroy-namespace <namespace>
    • Destroy the namespace
  6. ipmctl delete -f -dimm -pcd
    • Delete the Goal aka PCD (Platform Config Data)
  7. Reboot the system
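
For systems with many namespaces, the per-namespace steps of the long method can be generated from the ndctl list -N output. The sketch below only prints the commands so they can be reviewed first. The helper names are hypothetical, and the parsing assumes the JSON layout shown earlier in this guide, with no space after the colon in "dev":"...".

```shell
#!/bin/sh
# Sketch: turn `ndctl list -N` JSON into the per-namespace teardown commands.
# list_namespaces and print_teardown_commands are hypothetical helpers; they
# only print commands and never modify the system.

list_namespaces() {
    # Extract the value of every "dev" key, for example namespace0.0.
    grep -o '"dev":"[^"]*"' | cut -d'"' -f4
}

print_teardown_commands() {
    list_namespaces | while read -r ns; do
        echo "ndctl disable-namespace $ns"
        echo "ndctl destroy-namespace $ns"
    done
}

# Typical use (unmount file systems first, then review the output):
#   sudo ndctl list -N | print_teardown_commands
```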