madvise(2)


NAME

   madvise - give advice about use of memory

SYNOPSIS

   #include <sys/mman.h>

   int madvise(void *addr, size_t length, int advice);

   Feature Test Macro Requirements for glibc (see feature_test_macros(7)):

   madvise():
       Since glibc 2.19:
           _DEFAULT_SOURCE
       Up to and including glibc 2.19:
           _BSD_SOURCE

DESCRIPTION

   The  madvise()  system call is used to give advice or directions to the
   kernel about the address range beginning at address addr and with  size
   length   bytes.    Initially,  the  system  call  supported  a  set  of
   "conventional" advice values, which are also available on several other
   implementations.   (Note,  though,  that  madvise() is not specified in
   POSIX.)  Subsequently, a number of Linux-specific  advice  values  have
   been added.

   Conventional advice values
   The  advice values listed below allow an application to tell the kernel
   how it expects to use some mapped or shared memory areas, so  that  the
   kernel can choose appropriate read-ahead and caching techniques.  These
   advice values do not influence the semantics of the application (except
   in  the case of MADV_DONTNEED), but may influence its performance.  All
   of the advice values listed here have analogs  in  the  POSIX-specified
   posix_madvise(3)  function, and the values have the same meanings, with
   the exception of MADV_DONTNEED.

   The advice is indicated in the advice argument, which  is  one  of  the
   following:

   MADV_NORMAL
          No special treatment.  This is the default.

   MADV_RANDOM
          Expect  page references in random order.  (Hence, read ahead may
          be less useful than normally.)

   MADV_SEQUENTIAL
          Expect page references in sequential order.   (Hence,  pages  in
          the given range can be aggressively read ahead, and may be freed
          soon after they are accessed.)

   MADV_WILLNEED
          Expect access in the near future.  (Hence, it might  be  a  good
          idea to read some pages ahead.)

   MADV_DONTNEED
          Do  not  expect access in the near future.  (For the time being,
          the application is finished with the given range, so the  kernel
          can free resources associated with it.)

          After  a  successful  MADV_DONTNEED  operation, the semantics of
          memory access in the specified region  are  changed:  subsequent
          accesses  of pages in the range will succeed, but will result in
          either repopulating the  memory  contents  from  the  up-to-date
          contents   of  the  underlying  mapped  file  (for  shared  file
          mappings, shared anonymous mappings, and shmem-based  techniques
          such  as System V shared memory segments) or zero-fill-on-demand
          pages for anonymous private mappings.

          Note that, when applied to shared mappings, MADV_DONTNEED  might
          not  lead  to  immediate freeing of the pages in the range.  The
          kernel is free to delay freeing the pages until  an  appropriate
          moment.  The resident set size (RSS) of the calling process will
          be immediately reduced, however.

          MADV_DONTNEED cannot be applied to locked pages, Huge TLB pages,
          or  VM_PFNMAP  pages.   (Pages  marked  with the kernel-internal
          VM_PFNMAP flag are special memory areas that are not managed  by
          the  virtual memory subsystem.  Such pages are typically created
          by device drivers that map the pages into user space.)
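
   The change in access semantics described for MADV_DONTNEED can be
   observed on a private anonymous mapping.  The following is a minimal
   sketch, not a definitive test; the function name dontneed_zero_fills()
   is hypothetical and chosen only for illustration.

```c
#define _DEFAULT_SOURCE
#include <assert.h>
#include <string.h>
#include <sys/mman.h>
#include <unistd.h>

/* Hypothetical helper: returns 1 if a private anonymous page reads back
   as zeros after MADV_DONTNEED, 0 if it does not, and -1 on error. */
int dontneed_zero_fills(void)
{
    long page = sysconf(_SC_PAGESIZE);
    assert(page > 0);
    size_t len = (size_t) page;

    unsigned char *p = mmap(NULL, len, PROT_READ | PROT_WRITE,
                            MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
    if (p == MAP_FAILED)
        return -1;

    memset(p, 0xAB, len);               /* dirty the page */

    if (madvise(p, len, MADV_DONTNEED) == -1) {
        munmap(p, len);
        return -1;
    }

    /* The next access succeeds and sees a zero-fill-on-demand page. */
    int zeroed = 1;
    for (size_t i = 0; i < len; i++) {
        if (p[i] != 0) {
            zeroed = 0;
            break;
        }
    }

    munmap(p, len);
    return zeroed;
}
```

   The mapping stays valid after the call; only its contents are
   discarded, which is why the read loop does not fault.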

   Linux-specific advice values
   The following Linux-specific advice values have no counterparts in  the
   POSIX-specified  posix_madvise(3), and may or may not have counterparts
   in the madvise() interface available on  other  implementations.   Note
   that some of these operations change the semantics of memory accesses.

   MADV_REMOVE (since Linux 2.6.16)
          Free up a given range of pages and its associated backing store.
          This is equivalent to punching a hole in the corresponding  byte
          range  of  the  backing  store  (see  fallocate(2)).  Subsequent
          accesses  in  the  specified  address  range  will   see   bytes
          containing zero.

          The  specified address range must be mapped shared and writable.
          This flag cannot be applied to locked pages, Huge TLB pages,  or
          VM_PFNMAP pages.

          In the initial implementation, only tmpfs(5) supported
          MADV_REMOVE; but since Linux 3.5, any filesystem which supports
          the fallocate(2) FALLOC_FL_PUNCH_HOLE mode also supports
          MADV_REMOVE.  Hugetlbfs fails with the error EINVAL and
          other filesystems fail with the error EOPNOTSUPP.

   MADV_DONTFORK (since Linux 2.6.16)
          Do not make the pages in this range available to the child after
          a fork(2).  This is useful to  prevent  copy-on-write  semantics
          from  changing  the  physical  location  of a page if the parent
          writes to it after a  fork(2).   (Such  page  relocations  cause
          problems for hardware that DMAs into the page.)

   MADV_DOFORK (since Linux 2.6.16)
          Undo   the   effect  of  MADV_DONTFORK,  restoring  the  default
          behavior, whereby a mapping is inherited across fork(2).

   MADV_HWPOISON (since Linux 2.6.32)
          Poison the pages in the range specified by addr and  length  and
          handle  subsequent  references  to  those  pages like a hardware
          memory  corruption.   This  operation  is  available  only   for
          privileged (CAP_SYS_ADMIN) processes.  This operation may result
          in the calling process receiving a SIGBUS  and  the  page  being
          unmapped.

          This  feature  is  intended for testing of memory error-handling
          code; it is available only if the  kernel  was  configured  with
          CONFIG_MEMORY_FAILURE.

   MADV_MERGEABLE (since Linux 2.6.32)
          Enable  Kernel Samepage Merging (KSM) for the pages in the range
          specified by addr and length.  The kernel regularly scans  those
          areas of user memory that have been marked as mergeable, looking
          for pages with identical  content.   These  are  replaced  by  a
          single  write-protected page (which is automatically copied if a
          process later wants to update the content  of  the  page).   KSM
          merges only private anonymous pages (see mmap(2)).

          The  KSM feature is intended for applications that generate many
          instances of the same data (e.g., virtualization systems such as
          KVM).   It can consume a lot of processing power; use with care.
          See the Linux kernel source  file  Documentation/vm/ksm.txt  for
          more details.

          The MADV_MERGEABLE and MADV_UNMERGEABLE operations are available
          only if the kernel was configured with CONFIG_KSM.

   MADV_UNMERGEABLE (since Linux 2.6.32)
          Undo the effect of an earlier MADV_MERGEABLE  operation  on  the
          specified  address  range;  KSM  unmerges  whatever pages it had
          merged in the address range specified by addr and length.

   MADV_SOFT_OFFLINE (since Linux 2.6.33)
          Soft offline the pages  in  the  range  specified  by  addr  and
          length.   The  memory  of  each  page  in the specified range is
          preserved (i.e., when next accessed, the same  content  will  be
          visible,  but  in  a  new physical page frame), and the original
          page is offlined (i.e., no longer used, and taken out of  normal
          memory   management).    The  effect  of  the  MADV_SOFT_OFFLINE
          operation is invisible to (i.e., does not change  the  semantics
          of) the calling process.

          This  feature  is  intended for testing of memory error-handling
          code; it is available only if the  kernel  was  configured  with
          CONFIG_MEMORY_FAILURE.

   MADV_HUGEPAGE (since Linux 2.6.38)
          Enable  Transparent  Huge  Pages  (THP)  for  pages in the range
          specified by addr and length.  Currently, Transparent Huge Pages
          work  only  with  private  anonymous  pages  (see mmap(2)).  The
          kernel will  regularly  scan  the  areas  marked  as  huge  page
          candidates  to  replace  them  with huge pages.  The kernel will
          also allocate huge pages directly when the region  is  naturally
          aligned to the huge page size (see posix_memalign(3)).

          This  feature  is primarily aimed at applications that use large
          mappings of data and access large regions of that  memory  at  a
          time  (e.g.,  virtualization systems such as QEMU).  It can very
          easily waste memory (e.g., a 2MB mapping that only ever accesses
          1  byte  will  result  in 2MB of wired memory instead of one 4KB
          page).      See     the     Linux     kernel     source     file
          Documentation/vm/transhuge.txt for more details.

          The  MADV_HUGEPAGE  and MADV_NOHUGEPAGE operations are available
          only     if     the     kernel     was      configured      with
          CONFIG_TRANSPARENT_HUGEPAGE.

   MADV_NOHUGEPAGE (since Linux 2.6.38)
          Ensures  that  memory in the address range specified by addr and
          length will not be collapsed into huge pages.

   MADV_DONTDUMP (since Linux 3.4)
          Exclude from a core dump those pages in the range  specified  by
          addr and length.  This is useful in applications that have large
          areas of memory that are known not to be useful in a core  dump.
          The  effect  of MADV_DONTDUMP takes precedence over the bit mask
          that  is  set  via  the  /proc/[pid]/coredump_filter  file  (see
          core(5)).

   MADV_DODUMP (since Linux 3.4)
          Undo the effect of an earlier MADV_DONTDUMP.

   MADV_FREE (since Linux 4.5)
          The  application  no  longer  requires  the  pages  in the range
          specified by addr and length.  The kernel can thus free these
          pages,  but  the  freeing could be delayed until memory pressure
          occurs.  For each of the pages that has been marked to be  freed
          but  has not yet been freed, the free operation will be canceled
          if  the  caller  writes  into  the  page.   After  a  successful
          MADV_FREE  operation,  any  stale  data  (i.e., dirty, unwritten
          pages) will be lost when the kernel frees the  pages.   However,
          subsequent writes to pages in the range will succeed, and then
          the kernel cannot free those dirtied pages, so the caller can
          always see the just-written data.  If there is no subsequent write,
          the kernel can free the pages at any time.  Once  pages  in  the
          range  have  been freed, the caller will see zero-fill-on-demand
          pages upon subsequent page references.

          The MADV_FREE operation can be applied only to private anonymous
          pages  (see  mmap(2)).  On a swapless system, freeing pages in a
          given range happens instantly, regardless of memory pressure.
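
   Because MADV_FREE is available only since Linux 4.5, code that wants
   lazy freeing may need a fallback.  The sketch below assumes that
   falling back to MADV_DONTNEED is acceptable to the caller; the names
   release_range() and demo_release() are hypothetical.

```c
#define _DEFAULT_SOURCE
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <sys/mman.h>
#include <unistd.h>

/* Hypothetical helper: tell the kernel that a private anonymous range
   is no longer needed.  Tries MADV_FREE first and falls back to
   MADV_DONTNEED if MADV_FREE is not defined or is rejected with EINVAL.
   Returns 0 on success, or -1 with errno set. */
int release_range(void *addr, size_t length)
{
    assert(addr != NULL);
#ifdef MADV_FREE
    if (madvise(addr, length, MADV_FREE) == 0)
        return 0;
    if (errno != EINVAL)
        return -1;
#endif
    return madvise(addr, length, MADV_DONTNEED);
}

/* Returns 1 if data written after the release is visible, -1 on error. */
int demo_release(void)
{
    long page = sysconf(_SC_PAGESIZE);
    if (page <= 0)
        return -1;
    size_t len = (size_t) page;

    unsigned char *p = mmap(NULL, len, PROT_READ | PROT_WRITE,
                            MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
    if (p == MAP_FAILED)
        return -1;

    p[0] = 1;                           /* old data, no longer needed */
    if (release_range(p, len) == -1) {
        munmap(p, len);
        return -1;
    }

    p[0] = 42;                          /* new write after the release */
    int ok = (p[0] == 42);              /* just-written data is visible */

    munmap(p, len);
    return ok;
}
```

   Bytes that are not rewritten after the call may hold either the old
   data or zeros, so a caller should not read them before writing.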

RETURN VALUE

   On success, madvise() returns zero.  On error, it returns -1 and  errno
   is set appropriately.

ERRORS

   EACCES advice  is MADV_REMOVE, but the specified address range is not a
          shared writable mapping.

   EAGAIN A kernel resource was temporarily unavailable.

   EBADF  The map exists, but the area maps something that isn't a file.

   EINVAL addr is not page-aligned or length is negative.

   EINVAL advice is not a valid value.

   EINVAL advice is MADV_DONTNEED or MADV_REMOVE and the specified address
          range includes locked pages, Huge TLB pages, or VM_PFNMAP pages.

   EINVAL advice is MADV_MERGEABLE or MADV_UNMERGEABLE, but the kernel was
          not configured with CONFIG_KSM.

   EIO    (for  MADV_WILLNEED)  Paging  in  this  area  would  exceed  the
          process's maximum resident set size.

   ENOMEM (for MADV_WILLNEED) Not enough memory: paging in failed.

   ENOMEM Addresses  in  the  specified range are not currently mapped, or
          are outside the address space of the process.

   EPERM  advice is MADV_HWPOISON,  but  the  caller  does  not  have  the
          CAP_SYS_ADMIN capability.
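
   A caller typically inspects errno after a failed call to tell these
   cases apart.  The following sketch shows one way to do so and checks
   the EINVAL case for an address that is not page-aligned; try_advice()
   and misaligned_addr_is_einval() are hypothetical names.

```c
#define _DEFAULT_SOURCE
#include <assert.h>
#include <errno.h>
#include <stdio.h>
#include <string.h>
#include <sys/mman.h>
#include <unistd.h>

/* Hypothetical wrapper: returns 0 on success, otherwise the errno value
   left by madvise(), after reporting it on stderr. */
int try_advice(void *addr, size_t length, int advice)
{
    if (madvise(addr, length, advice) == -1) {
        int err = errno;                /* save before calling stdio */
        fprintf(stderr, "madvise(%p, %zu, %d): %s\n",
                addr, length, advice, strerror(err));
        return err;
    }
    return 0;
}

/* Returns 1 if an aligned call succeeds and a misaligned one fails with
   EINVAL, 0 otherwise, and -1 if the mapping cannot be created. */
int misaligned_addr_is_einval(void)
{
    long page = sysconf(_SC_PAGESIZE);
    assert(page > 0);
    size_t len = 2 * (size_t) page;

    char *p = mmap(NULL, len, PROT_READ | PROT_WRITE,
                   MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
    if (p == MAP_FAILED)
        return -1;

    int aligned = try_advice(p, len, MADV_NORMAL);
    int misaligned = try_advice(p + 1, (size_t) page, MADV_NORMAL);

    munmap(p, len);
    return aligned == 0 && misaligned == EINVAL;
}
```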

VERSIONS

   Since  Linux  3.18, support for this system call is optional, depending
   on the setting of the CONFIG_ADVISE_SYSCALLS configuration option.

CONFORMING TO

   madvise() is not specified by any standards.  Versions of  this  system
   call, implementing a wide variety of advice values, exist on many other
   implementations.  Other implementations typically implement at least
   the values listed above under Conventional advice values, albeit with
   some variation in semantics.

   POSIX.1-2001 describes posix_madvise(3) with constants
   POSIX_MADV_NORMAL, POSIX_MADV_RANDOM, POSIX_MADV_SEQUENTIAL,
   POSIX_MADV_WILLNEED, and POSIX_MADV_DONTNEED, with behavior
   close to the similarly named values listed above.  (POSIX.1-2008 adds a
   further flag, POSIX_MADV_NOREUSE, that has no analog in madvise().)

NOTES

   Linux notes
   The Linux implementation  requires  that  the  address  addr  be  page-
   aligned,  and allows length to be zero.  If there are some parts of the
   specified address range that are  not  mapped,  the  Linux  version  of
   madvise()  ignores  them  and applies the call to the rest (but returns
   ENOMEM from the system call, as it should).
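
   Since addr must be page-aligned, advice for an arbitrary buffer has to
   be applied to a page-aligned range.  The sketch below rounds inward, so
   that only pages lying entirely inside the buffer are affected; this is
   a conservative choice suited to destructive advice such as
   MADV_DONTNEED, not the only possible one.  The names
   advise_whole_pages() and demo_whole_pages() are hypothetical.

```c
#define _DEFAULT_SOURCE
#include <assert.h>
#include <stdint.h>
#include <sys/mman.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical helper: apply advice to the pages lying entirely inside
   [buf, buf + n).  Returns the number of bytes advised, 0 if no whole
   page fits in the range, or -1 on error (errno set by madvise()). */
ssize_t advise_whole_pages(void *buf, size_t n, int advice)
{
    long page = sysconf(_SC_PAGESIZE);
    assert(page > 0);
    uintptr_t mask = (uintptr_t) page - 1;

    uintptr_t start = ((uintptr_t) buf + mask) & ~mask;  /* round up */
    uintptr_t end = ((uintptr_t) buf + n) & ~mask;       /* round down */

    if (end <= start)
        return 0;
    if (madvise((void *) start, end - start, advice) == -1)
        return -1;
    return (ssize_t) (end - start);
}

/* Returns 1 if exactly one page is advised for an unaligned two-page
   range and a zero-length call succeeds, 0 otherwise, -1 on error. */
int demo_whole_pages(void)
{
    long page = sysconf(_SC_PAGESIZE);
    assert(page > 0);
    size_t len = 4 * (size_t) page;

    char *p = mmap(NULL, len, PROT_READ | PROT_WRITE,
                   MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
    if (p == MAP_FAILED)
        return -1;

    /* The range starts 100 bytes into the first page, so only the
       second page lies fully inside it. */
    ssize_t advised = advise_whole_pages(p + 100, 2 * (size_t) page,
                                         MADV_RANDOM);

    /* A zero length is accepted by the Linux implementation. */
    int zero_ok = (madvise(p, 0, MADV_NORMAL) == 0);

    munmap(p, len);
    return advised == (ssize_t) page && zero_ok;
}
```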

SEE ALSO

   getrlimit(2), mincore(2), mmap(2),  mprotect(2),  msync(2),  munmap(2),
   prctl(2), posix_madvise(3), core(5)

COLOPHON

   This  page  is  part of release 4.09 of the Linux man-pages project.  A
   description of the project, information about reporting bugs,  and  the
   latest     version     of     this    page,    can    be    found    at
   https://www.kernel.org/doc/man-pages/.




