Lab 4: Advanced system and process monitoring¶
Objectives¶
After completing this lab, you will be able to
- view and manage processes using advanced tools
- diagnose and debug system calls
- view and set process priority using advanced CLI tools
- view and set custom scheduling policies for processes
- analyze system and application performance
Estimated time to complete this lab: 90 minutes
Introduction¶
The commands in this lab cover a broader spectrum of process management, system monitoring, and resource control in Linux. They add more depth and variety to your System Administrator repertoire.
These exercises provide hands-on experience with additional Linux commands and concepts for process management, resource monitoring, and advanced control.
Exercise 1¶
fuser¶
The fuser command in Linux identifies processes using files or sockets. It can be a useful aid in file-related process management and conflict resolution.
To create a script to simulate file usage¶
First, create an empty test file we want to access. Type:
touch ~/testfile.txt
Create the script that we will use to simulate access to testfile.txt. Type:
cat > ~/simulate_file_usage.sh << EOF
#!/bin/bash
tail -f ~/testfile.txt
EOF
Make the script executable. Type:
chmod +x ~/simulate_file_usage.sh
Launch the script. Type:
~/simulate_file_usage.sh &
To identify processes accessing a file¶
Identify processes using or accessing testfile.txt. Run:
fuser ~/testfile.txt
Explore additional fuser details using the -v option. Type:
fuser -v ~/testfile.txt
All done with testfile.txt and simulate_file_usage.sh. You can now remove the files. Type:
kill %1
rm ~/testfile.txt ~/simulate_file_usage.sh
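To get a feel for the kind of information fuser reports, the following is a rough sketch that scans /proc for open file descriptors pointing at a file. The helper name pids_using_file is hypothetical, and the sketch assumes /proc is mounted. Unlike fuser, it only sees plain open file descriptors (not memory maps or working directories), and without root it only sees your own processes.

```bash
#!/bin/bash
# Sketch: list PIDs that hold a given file open by scanning /proc/<pid>/fd.
# pids_using_file is a hypothetical helper name, not part of fuser.
pids_using_file() {
    local target fd pid
    target=$(readlink -f -- "$1") || return 1
    for fd in /proc/[0-9]*/fd/*; do
        # Each entry is a symlink to the open file; unreadable entries are skipped.
        if [ "$(readlink -- "$fd" 2>/dev/null)" = "$target" ]; then
            pid=${fd#/proc/}
            echo "${pid%%/*}"
        fi
    done | sort -un
}

# Usage while the file is still open somewhere:
#   pids_using_file ~/testfile.txt
```

Comparing its output with fuser -v on the same file is a quick way to see what extra access types fuser tracks.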
To identify a process accessing a TCP/UDP port¶
Use the fuser command to identify the process accessing TCP port 22 on your server. Type:
sudo fuser 22/tcp
Exercise 2¶
perf¶
perf is a versatile tool for analyzing system and application performance in Linux. It can offer extra insights that can aid performance tuning.
To install perf¶
Install the perf application if it is not installed on your server. Type:
sudo dnf -y install perf
The bc application is a command-line precision calculator. bc will be used in this exercise to simulate high CPU load. If bc is not already installed on your server, install it with:
sudo dnf -y install bc
To create a script to generate CPU load¶
Create a CPU load script and make it executable. The quoted 'EOF' delimiter stops your shell from expanding $#, $0, and $1 while the file is being written. Type:
cat > ~/generate_cpu_load.sh << 'EOF'
#!/bin/bash

# Check if the number of decimal places is passed as an argument
if [ "$#" -ne 1 ]; then
    echo "Usage: $0 <number_of_decimal_places>"
    exit 1
fi

# Calculate Pi to the specified number of decimal places
for i in {1..10}; do echo "scale=$1; 4*a(1)" | bc -l; done
EOF
chmod +x ~/generate_cpu_load.sh
Tip
The generate_cpu_load.sh script is a simple tool for generating CPU load by calculating Pi (π) to high precision. The same calculation is done 10 times. The script accepts an integer as the parameter for specifying the number of decimal places for calculating Pi.
To simulate extra CPU load¶
Let's run a simple test and calculate Pi to 50 decimal places. Run the Script by typing:
~/generate_cpu_load.sh 50 &
Rerun the script, but use perf to record the script's performance to analyze CPU usage and other metrics. Type:
~/generate_cpu_load.sh 1000 & perf record -p $! sleep 5
Tip
The sleep 5 command given to perf record defines the time window for perf to collect performance data about the CPU load generated by the generate_cpu_load.sh script. It allows perf to record performance metrics for 5 seconds before automatically stopping.
To analyze performance data and monitor real-time events¶
Use the perf report command to review the performance data report to understand the CPU and memory utilization patterns. Type:
sudo perf report
You can use various keyboard keys to explore the report further. Type q to exit/quit the perf report viewer interface.
Observe/capture real-time CPU cache events for 40 seconds to identify potential performance bottlenecks. Type:
sudo perf stat -e cache-references,cache-misses sleep 40
To record the system's comprehensive performance¶
Capture system-wide performance data that can be used for extra analysis. Type:
sudo perf record -a sleep 10
Explore specific event counters. Count specific events like CPU cycles to evaluate the performance of a given script or application. Let's test with a basic find command. Type:
sudo perf stat -e cycles find /proc
Do the same for the ./generate_cpu_load.sh script. Type:
sudo perf stat -e cycles ./generate_cpu_load.sh 500
OUTPUT:
...<SNIP>...
3.141592653589793238462643383279502884197169399375105820974944592307\
81640628620899862803482534211.....

 Performance counter stats for './generate_cpu_load.sh 500':

     1,670,638,886      cycles

       0.530479014 seconds time elapsed

       0.488580000 seconds user
       0.034628000 seconds sys
Note
Here's the breakdown of the final sample output of the perf stat command:
1,670,638,886 cycles: This indicates the total number of CPU cycles consumed during the execution of the script. Each cycle represents a single step in the CPU's instruction execution.
0.530479014 seconds time elapsed: This is the total elapsed real-world time (or wall-clock time) from the start to the end of the script execution. This duration includes all types of waits (like waiting for disk I/O or system calls).
0.488580000 seconds user: This is the CPU time spent in user mode. It explicitly excludes time spent on system-level tasks.
0.034628000 seconds sys: This is the CPU time spent in the kernel or system mode. This includes the time the CPU spends executing system calls or performing other system-level tasks on behalf of the script.
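The figures in this breakdown can be combined into two derived numbers. The sketch below uses only the sample values shown above. It computes the share of wall-clock time spent on the CPU, and the average clock rate while the script was running. Treat the second figure as a rough estimate, since it assumes the cycle count and the CPU times cover the same set of processes.

```bash
#!/bin/bash
# Figures copied from the sample perf stat output above.
cycles=1670638886
elapsed=0.530479014
user=0.488580000
sys=0.034628000

# CPU utilization: share of wall-clock time spent on the CPU (user + sys).
cpu_util=$(awk -v u="$user" -v s="$sys" -v e="$elapsed" \
    'BEGIN { printf "%.1f", (u + s) / e * 100 }')

# Average clock rate while on-CPU: cycles divided by CPU seconds, in GHz.
avg_ghz=$(awk -v c="$cycles" -v u="$user" -v s="$sys" \
    'BEGIN { printf "%.2f", c / (u + s) / 1e9 }')

echo "CPU utilization: ${cpu_util}%"
echo "Average clock while on-CPU: ${avg_ghz} GHz"
```

A utilization close to 100% suggests the script is CPU-bound. A much lower value would point to time spent waiting, for example on I/O.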
All done with the perf tool. Ensure that any background scripts are terminated for a clean working environment. Type:
kill %1
Exercise 3¶
strace¶
strace is used for diagnosing and debugging system call interactions in Linux.
To create a script for exploring strace¶
Create a simple script named strace_script.sh and make it executable. Type:
cat > ~/strace_script.sh << EOF
#!/bin/bash
while true; do
    date
    sleep 1
done
EOF
chmod +x ~/strace_script.sh
To use strace on running processes¶
Run the script in the background so that strace can be attached to it. Type:
~/strace_script.sh &
Find the PID for the strace_script.sh process in a separate terminal. Store the PID in a variable named MYPID. We'll use the pgrep command for this by running:
export MYPID=$(pgrep strace_script) ; echo $MYPID
OUTPUT:
4006301
Start tracing the system calls of the script to understand how it interacts with the kernel. Attach strace to the running script process by typing:
sudo strace -p $MYPID
Detach or stop the strace process by typing Ctrl+C.
The strace output can be filtered by focusing on specific system calls such as open and read to analyze their behavior. Try doing this for the open and read system calls. Type:
sudo strace -e trace=open,read -p $MYPID
When you are done trying to decipher the strace output, stop the strace process by typing Ctrl+C.
Redirect the output to a file for later analysis, which can help diagnose issues. Save strace output to a file in your home directory by running:
sudo strace -o ~/strace_output.txt -p $MYPID
To analyze the frequency of system calls¶
Summarize the system call counts to identify the most frequently used system calls by the process. Do this for only 10 seconds by prepending the timeout command. Type:
sudo timeout 10 strace -c -p $MYPID
Our sample system shows a summary report output like this:
OUTPUT:
strace: Process 4006301 attached
strace: Process 4006301 detached
% time     seconds  usecs/call     calls    errors syscall
------ ----------- ----------- --------- --------- ----------------
 89.59    0.042553        1182        36        18 wait4
  7.68    0.003648         202        18           clone
  1.67    0.000794           5       144           rt_sigprocmask
  0.45    0.000215           5        36           rt_sigaction
  0.36    0.000169           9        18           ioctl
  0.25    0.000119           6        18           rt_sigreturn
------ ----------- ----------- --------- --------- ----------------
100.00    0.047498         175       270        18 total
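The summary is sorted by time, so the most frequently called system call is not necessarily on the first row. As a small sketch, the hypothetical helper rank_syscalls below re-sorts a summary by the calls column. It assumes the column layout shown in the sample output, where calls is the fourth field and the syscall name is the last.

```bash
#!/bin/bash
# rank_syscalls is a hypothetical helper: it reads `strace -c` summary text on
# stdin and prints "syscall calls" pairs, busiest first.
rank_syscalls() {
    awk '
        # Data rows start with a percentage such as 89.59.
        # Header, separator, attach/detach and total rows are skipped.
        $1 ~ /^[0-9]+\.[0-9]+$/ && $NF != "total" { print $NF, $4 }
    ' | sort -k2,2nr
}

# A few rows copied from the sample summary above.
rank_syscalls << 'EOF'
 89.59    0.042553        1182        36        18 wait4
  7.68    0.003648         202        18           clone
  1.67    0.000794           5       144           rt_sigprocmask
100.00    0.047498         175       270        18 total
EOF
```

strace writes its summary to standard error, so a live run would need a redirect, for example: sudo timeout 10 strace -c -p $MYPID 2>&1 | rank_syscalls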
Terminate the script and remove any files created. Type:
kill $MYPID
rm ~/strace_script.sh ~/strace_output.txt
Exercise 4¶
atop¶
atop provides a comprehensive view of system performance, covering various resource metrics.
To launch and explore atop¶
Install the atop application if it is not installed on your server. Type:
sudo dnf -y install atop
Run atop by typing:
sudo atop
Within the atop interface, you can explore various atop metrics by pressing specific keys on your keyboard.
Use 'm', 'd', or 'n' to switch between memory, disk, or network views. Observe how resources are being utilized in real time.
Monitor system performance at a custom interval of 2 seconds, allowing a more granular view of system activity. Type:
sudo atop 2
Switch between different resource views to focus on specific aspects of system performance.
Generate a log file report for system activity, capturing data every 60 seconds, three times. Type:
sudo atop -w /tmp/atop_log 60 3
Once the previous command is completed, you can take your time and review the binary file that the logs were saved to. To read back the saved log file, type:
sudo atop -r /tmp/atop_log
Clean up by removing any logs or files generated.
sudo rm /tmp/atop_log
Exercise 5¶
numactl¶
Non-Uniform Memory Access (NUMA) is a computer memory design/architecture used in multiprocessing that enhances memory access speed by considering the physical location of memory relative to processors. In NUMA-based systems, multiple processors (or CPU cores) are physically grouped, and each group has its local memory.
The numactl application manages NUMA policy, optimizing performance on NUMA-based systems.
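Before experimenting with NUMA policies, it helps to know how many NUMA nodes the server has. The following is a minimal sketch that counts the node directories exposed under sysfs. The helper name numa_node_count is hypothetical, and the /sys/devices/system/node path is an assumption about the running kernel. The sketch prints 0 when that path is not present.

```bash
#!/bin/bash
# numa_node_count is a hypothetical helper that counts nodeN directories.
# The default sysfs path is an assumption; pass another directory to override it.
numa_node_count() {
    local base=${1:-/sys/devices/system/node} count=0 dir
    for dir in "$base"/node[0-9]*; do
        # An unmatched glob stays literal and fails the directory test.
        if [ -d "$dir" ]; then
            count=$((count + 1))
        fi
    done
    echo "$count"
}

echo "NUMA nodes detected: $(numa_node_count)"
```

On a server that reports a single node, options that name more than one node (such as --membind=0,1) are not expected to work.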
To install numactl¶
Install the numactl application if it is not installed on your server. Type:
sudo dnf -y install numactl
To create a memory-intensive script¶
Create a simple script to help simulate a memory-intensive workload on your server. Type:
cat > ~/memory_intensive.sh << EOF
#!/bin/bash
awk 'BEGIN{for(i=0;i<1000000;i++)for(j=0;j<1000;j++);}'
EOF
chmod +x ~/memory_intensive.sh
To use numactl¶
Run the script with numactl. Type:
numactl --membind=0 ~/memory_intensive.sh
If your system has more than one NUMA node available, you can run the script on multiple NUMA nodes via:
numactl --cpunodebind=0,1 --membind=0,1 ~/memory_intensive.sh
Show the NUMA policy settings of the current process. Type:
numactl --show
Bind memory to a specific node by running:
numactl --membind=0 ~/memory_intensive.sh
Clean up your working environment by removing the script.
rm ~/memory_intensive.sh
Exercise 6¶
iotop¶
The iotop command monitors disk I/O (input/output) usage by processes and threads. It provides real-time information similar to the top command, specifically for disk I/O. This makes it essential for diagnosing system slowdowns caused by disk activity.
To install iotop¶
Install the iotop utility if it is not installed. Type:
sudo dnf -y install iotop
To use iotop to monitor disk I/O¶
Run the iotop command without any options to use it in its default interactive mode. Type:
sudo iotop
Observe the live disk I/O usage by various processes. Use this to identify processes currently reading from or writing to the disk.
Type q to quit or exit iotop.
To use iotop in non-interactive mode¶
Run iotop in batch mode (-b) to get a non-interactive view of I/O usage. The -n 10 option tells iotop to take 10 samples before exiting. Type:
sudo iotop -b -n 10
iotop can filter I/O for specific processes. Identify a process ID (PID) from your system using the ps command or the iotop display. Then, filter iotop output for that specific PID. For example, filter for the PID of the sshd process by running:
sudo iotop -p $(pgrep sshd | head -1)
The -o option with iotop shows only the processes or threads actually doing I/O, instead of displaying all processes or threads. Display only I/O processes by running:
sudo iotop -o
Discussion
Discuss the impact of disk I/O on overall system performance and how tools like iotop can aid in system optimization.
Exercise 7¶
cgroups¶
Control Groups (cgroups) provide a mechanism in Linux to organize, limit and prioritize the resource usage of processes.
This exercise demonstrates direct interaction with the cgroup v2 filesystem.
To explore the cgroup filesystem¶
Use the ls command to explore the contents and structure of the cgroup filesystem. Type:
ls /sys/fs/cgroup/
Use the ls command again to list the *.slice folders under the cgroup filesystem. Type:
ls -d /sys/fs/cgroup/*.slice
The folders with the .slice suffix are typically used in systemd to represent a slice of system resources. These are standard cgroups managed by systemd for organizing and managing system processes.
To create a custom cgroup¶
Create a directory named "exercise_group" under the /sys/fs/cgroup file system. This new folder will house the control group structures needed for the rest of this exercise. Use the mkdir command by typing:
sudo mkdir -p /sys/fs/cgroup/exercise_group
List the files and directories under the /sys/fs/cgroup/exercise_group structure. Type:
sudo ls /sys/fs/cgroup/exercise_group/
The output shows the files and directories automatically created by the cgroup subsystem to manage and monitor the resources for the cgroup.
To set a new memory resource limit¶
Let's set a memory resource limit of 4096 bytes (4kB). To restrict processes in the cgroup to a maximum of 4kB of memory, type:
echo 4096 | sudo tee /sys/fs/cgroup/exercise_group/memory.max
Confirm Memory Limit has been set. Type:
cat /sys/fs/cgroup/exercise_group/memory.max
To create the memory_stress test script¶
Create a simple executable script using the dd command to test the memory resource limit. Type:
cat > ~/memory_stress.sh << EOF
#!/bin/bash
dd if=/dev/zero of=/tmp/stress_test bs=10M count=2000
EOF
chmod +x ~/memory_stress.sh
To run and add process/script to the memory cgroup¶
Launch the memory_stress.sh script, capture its PID and add the PID to cgroup.procs. Type:
~/memory_stress.sh &
echo $! | sudo tee /sys/fs/cgroup/exercise_group/cgroup.procs
The /sys/fs/cgroup/exercise_group/cgroup.procs file can be used for adding or viewing the PIDs (Process IDs) of processes that are members of a given cgroup. Writing a PID to this file assigns the ~/memory_stress.sh script process to the exercise_group cgroup.
The previous command will end very quickly before completion because it has exceeded the memory limits of the cgroup. You can run the following journalctl command in another terminal to view the error as it happens. Type:
journalctl -xe -f | grep -i memory
Tip
You can quickly use the ps command to check the approximate memory usage of a process if you know the PID of the process by running:
pidof <PROCESS_NAME> | xargs ps -o pid,comm,rss
This output should show the Resident Set Size (RSS) in KB, indicating the memory used by the specified process at a point in time. Whenever the RSS value of a process exceeds the memory limit specified in the cgroup's memory.max value, the process may be subject to memory management policies enforced by the kernel or the cgroup itself. Depending on the system configuration, the system may take actions such as throttling the process's memory usage, killing the process, or triggering an out-of-memory (OOM) event.
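Note that ps reports RSS in KB while memory.max holds a value in bytes, so the two need converting before they can be compared. The sketch below shows that conversion with a hypothetical helper named over_limit. It is only a rough comparison, because the kernel's cgroup memory accounting is not identical to the RSS figure from ps. The 1500 KB value is an assumed example, not a measurement.

```bash
#!/bin/bash
# over_limit is a hypothetical helper: it succeeds when an RSS value in KB
# (as printed by ps) is larger than a memory.max value in bytes.
over_limit() {
    local rss_kb=$1 max_bytes=$2
    # The literal value "max" in memory.max means no limit is configured.
    if [ "$max_bytes" = "max" ]; then
        return 1
    fi
    [ $((rss_kb * 1024)) -gt "$max_bytes" ]
}

# Example: the 4096-byte limit from this exercise and an assumed RSS of 1500 KB.
if over_limit 1500 4096; then
    echo "RSS is above memory.max"
fi
```

With a 4096-byte limit, almost any real process is over the limit as soon as it starts, which matches how quickly the memory_stress.sh script ends in this exercise.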
To set a new CPU resource limit¶
Restrict the script to use only 10% of a CPU core. Type:
echo 10000 | sudo tee /sys/fs/cgroup/exercise_group/cpu.max
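10000 is the CPU time quota in microseconds. Because only one value is written, the period keeps its default of 100000 microseconds, so processes in the group may use 10% of a single CPU core's capacity.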
Confirm CPU Limit has been set. Type:
cat /sys/fs/cgroup/exercise_group/cpu.max
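The cpu.max file holds a quota and a period, both in microseconds, and the CPU share is the quota divided by the period. As a small sketch, the hypothetical helper cpu_max_for_percent below turns a percentage of one core into that pair. It assumes the default period of 100000 microseconds unless a second argument is given.

```bash
#!/bin/bash
# cpu_max_for_percent is a hypothetical helper: it prints the "QUOTA PERIOD"
# pair for cpu.max that corresponds to a percentage of one CPU core.
cpu_max_for_percent() {
    local percent=$1 period=${2:-100000}   # period in microseconds
    echo "$((percent * period / 100)) $period"
}

cpu_max_for_percent 10     # 10% of one core
cpu_max_for_percent 150    # one and a half cores
```

The result can be written to the cgroup in the same way as before, for example: cpu_max_for_percent 25 | sudo tee /sys/fs/cgroup/exercise_group/cpu.max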
To create the CPU stress test script¶
Create and set executable permissions for a script to generate high CPU usage. Type:
cat > ~/cpu_stress.sh << EOF
#!/bin/bash
exec yes > /dev/null
EOF
chmod +x ~/cpu_stress.sh
Note
yes > /dev/null is a simple command that generates a high CPU load.
To run and add a process/script to the CPU cgroup¶
Run the script and immediately add its PID to the cgroup, by typing:
~/cpu_stress.sh &
echo $! | sudo tee /sys/fs/cgroup/exercise_group/cgroup.procs
To confirm process CPU usage resource control¶
Check the CPU usage of the process.
pidof yes | xargs top -b -n 1 -p
The output should show the real-time CPU usage of the yes process. The %CPU for yes should be limited per the cgroup configuration (e.g., around 10% if the limit is set to 10000).
Set and experiment with other values for cpu.max for the exercise_group cgroup and then observe the effect every time you rerun the ~/cpu_stress.sh script within the control group.
To identify and select the primary storage device¶
The primary storage device can be a target for setting I/O resource limits. Storage devices on Linux systems have major and minor device numbers that can be used to identify them uniquely.
First, let's create a set of helper variables to detect and store the device number for the primary storage device on the server. Type:
primary_device=$(lsblk | grep disk | awk '{print $1}' | head -n 1)
primary_device_num=$(ls -l /dev/$primary_device | awk '{print $5 $6}' | sed 's/,/:/')
Display the value of the $primary_device_num variable. Type:
echo "Primary Storage Device Number: $primary_device_num"
The major and minor device numbers should match what you see in this ls output:
ls -l /dev/$primary_device
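Parsing ls output is one way to get the device numbers. As an alternative sketch, the hypothetical helper dev_numbers below asks stat for them directly. It assumes GNU stat, whose %t and %T formats print the major and minor numbers in hexadecimal, so the helper converts them to the decimal MAJOR:MINOR form.

```bash
#!/bin/bash
# dev_numbers is a hypothetical helper: it prints MAJOR:MINOR for a device node.
# It assumes GNU stat, where %t and %T are the major and minor numbers in hex.
dev_numbers() {
    local major minor
    read -r major minor < <(stat -c '%t %T' -- "$1" 2>/dev/null) || return 1
    # Convert the hexadecimal values to decimal.
    printf '%d:%d\n' "0x$major" "0x$minor"
}

dev_numbers /dev/null    # the null device is 1:3 on Linux
```

For this exercise the call would be dev_numbers /dev/$primary_device, and the result should agree with the value stored in $primary_device_num.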
To set a new I/O resource limit¶
Limit read and write throughput to 1 MB/s for processes under the exercise_group cgroup. Type:
echo "$primary_device_num rbps=1048576 wbps=1048576" | \
    sudo tee /sys/fs/cgroup/exercise_group/io.max
Confirm I/O limits set. Type:
cat /sys/fs/cgroup/exercise_group/io.max
To create the I/O stress test process¶
Start a dd process to create a large file named /tmp/io_stress. Also, capture and store the PID of the dd process in a variable named MYPID. Type:
dd if=/dev/zero of=/tmp/io_stress bs=10M count=500 oflag=dsync &
export MYPID=$!
To add a process/script to the I/O cgroup¶
Add the PID of the previous dd process to the exercise_group cgroup. Type:
echo $MYPID | sudo tee /sys/fs/cgroup/exercise_group/cgroup.procs
To confirm process I/O usage resource control¶
Check the I/O usage of the process by executing:
sudo iotop -p $MYPID
The output will display I/O read/write speeds for the dd process, which should not exceed 1 MB/s as per the limit.
To remove cgroups¶
Type the following commands to end any background process, delete the no-longer-needed cgroup and remove the /tmp/io_stress file.
kill %1
sudo rmdir /sys/fs/cgroup/exercise_group/
sudo rm -rf /tmp/io_stress
Exercise 8¶
taskset¶
CPU affinity binds specific processes or threads to particular CPU cores in a multi-core system. This exercise demonstrates the use of taskset to set or retrieve the CPU affinity of a process in Linux.
To explore CPU Affinity with taskset¶
Use the lscpu command to list available CPUs on your system. Type:
lscpu | grep "On-line"
Let's create a sample process using the dd utility and store its PID in a MYPID variable. Type:
dd if=/dev/zero of=/dev/null &
export MYPID="$!"
echo $MYPID
Retrieve current affinity for the dd process. Type:
taskset -p $MYPID
OUTPUT:
pid 1211483's current affinity mask: f
The output shows the CPU affinity mask of the process with a PID of 1211483 ($MYPID), represented in hexadecimal format. On our sample system, the affinity mask displayed is "f", which typically means that the process can run on any CPU core.
Note
The CPU affinity mask "f" represents a configuration where all CPU cores are enabled. In hexadecimal notation, "f" corresponds to the binary value "1111". Each bit in the binary representation corresponds to a CPU core, with "1" indicating that the core is enabled and available for the process to run on.
Therefore, on a four-core CPU, with the mask "f":
Core 0: Enabled
Core 1: Enabled
Core 2: Enabled
Core 3: Enabled
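The mapping from a hexadecimal mask to CPU cores can be worked out bit by bit. The sketch below does this with a hypothetical helper named mask_to_cpus. It relies on bash integer arithmetic, so it is only suitable for masks that fit in 63 bits.

```bash
#!/bin/bash
# mask_to_cpus is a hypothetical helper: it expands a hexadecimal affinity mask
# (as printed by taskset -p) into the list of CPU numbers it enables.
mask_to_cpus() {
    local mask=$((16#${1#0x})) cpu=0 cpus=()
    while [ "$mask" -gt 0 ]; do
        # The lowest bit tells whether the current CPU is enabled.
        if [ $((mask & 1)) -eq 1 ]; then
            cpus+=("$cpu")
        fi
        mask=$((mask >> 1))
        cpu=$((cpu + 1))
    done
    echo "${cpus[*]}"
}

mask_to_cpus f      # 0 1 2 3
mask_to_cpus 0x1    # 0
mask_to_cpus 3      # 0 1
```

The same idea works in reverse: setting bit N of the mask enables CPU N, which is why a mask of 0x1 selects CPU 0 only.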
To set/change CPU affinity¶
Set the CPU affinity of the dd process to a single CPU (CPU 0). Type:
taskset -p 0x1 $MYPID
OUTPUT:
pid 1211483's current affinity mask: f
pid 1211483's new affinity mask: 1
Verify the change by running the following:
taskset -p $MYPID
The output indicates the CPU affinity mask of the process with PID $MYPID. The affinity mask is "1" in hexadecimal, which is also "1" in binary. This means that the process is currently bound to CPU core 0.
Now, set the CPU affinity of the dd process to multiple CPUs (CPUs 0 and 1). Type:
taskset -p 0x3 $MYPID
Issue the correct taskset command to verify the latest change.
taskset -p $MYPID
On our demo 4-core CPU server, the output shows that the CPU affinity mask of the process is "3" (in hexadecimal). This translates to "11" in binary.
Tip
Hexadecimal "3" is "11" (or 0011) in binary. Each binary digit corresponds to a CPU core: core 0, core 1, core 2, core 3 (from right to left). The digit "1" in the first and second positions (from the right) indicates that the process can run on cores 0 and 1. Therefore, "3" signifies that the process is bound to CPU cores 0 and 1.
Launch either the top or htop utility in a separate terminal and observe if you see anything of interest as you experiment with different taskset configurations for a process.
All done. Use its PID ($MYPID) to kill the dd process.
Exercise 9¶
systemd-run¶
The systemd-run command creates and starts transient service units for running commands or processes. It can also run programs in transient scope units, path-, socket-, or timer-triggered service units.
This exercise shows how to use systemd-run for creating transient service units in systemd.
To run a command as a transient service¶
Run the simple sleep 300 command as a transient systemd service using systemd-run. Type:
systemd-run --unit=mytransient.service --description="Example Service" sleep 300
Check the status of the transient service using systemctl status. Type:
systemctl status mytransient.service
To set a memory resource limit for a transient service¶
Use the --property parameter with systemd-run to limit the maximum memory usage for the transient process to 200M. Type:
systemd-run --unit=mylimited.service --property=MemoryMax=200M sleep 300
Look under the corresponding cgroup file system for the process to verify the setting. Type:
sudo cat /sys/fs/cgroup/system.slice/mylimited.service/memory.max
Tip
systemd.resource-control is a configuration or management entity (concept) within the systemd framework designed for controlling and allocating system resources to processes and services. systemd.exec is a systemd component responsible for defining the execution environment in which commands are executed. To view the various settings (properties) you can tweak when using systemd-run, consult the systemd.resource-control and systemd.exec manual pages. This is where you will find documentation for properties like MemoryMax, CPUAccounting, IOWeight, etc.
To set CPU resource limit for a transient service¶
Let's create a transient systemd unit called "myrealtime.service". Run myrealtime.service with a specific round robin (rr) scheduling policy and priority. Type:
systemd-run --unit=myrealtime.service \
    --property=CPUSchedulingPolicy=rr --property=CPUSchedulingPriority=50 sleep 300
View the status for myrealtime.service. Also, capture/store the main [sleep] PID in a MYPID variable. Type:
MYPID=$(systemctl status myrealtime.service | awk '/Main PID/ {print $3}')
Verify its CPU scheduling policy while the service is still running. Type:
chrt -p $MYPID
OUTPUT:
pid 2553792's current scheduling policy: SCHED_RR
pid 2553792's current scheduling priority: 50
To create a transient timer unit¶
Create a simple timer unit that runs a simple echo command. The --on-active=2m option sets the timer to trigger 2 minutes after the timer unit becomes active. Type:
systemd-run --on-active=2m --unit=mytimer.timer \
    --description="Example Timer" echo "Timer triggered"
The timer will start counting down from the time the unit is activated, and after 2 minutes, it will trigger the specified action.
View details/status for the timer unit that was just created. Type:
systemctl status mytimer.timer
To stop and clean up transient systemd units¶
Type the following commands to ensure that the various transient services/processes started for this exercise are properly stopped and removed from your system.
systemctl stop mytransient.service
systemctl stop mylimited.service
systemctl stop myrealtime.service
systemctl stop mytimer.timer
Exercise 10¶
schedtool¶
This exercise demonstrates the use of schedtool to understand and manipulate process scheduling in Rocky Linux. We will also create a script to simulate a process for this purpose.
To install schedtool¶
Install the schedtool application if it is not installed on your server. Type:
sudo dnf -y install schedtool
To create a simulated process script¶
Create a script that generates CPU load for testing purposes. Type:
cat > ~/cpu_load_generator.sh << EOF
#!/bin/bash
while true; do
    openssl speed > /dev/null 2>&1
    openssl speed > /dev/null 2>&1
done
EOF
chmod +x ~/cpu_load_generator.sh
Start the script in the background. Type:
~/cpu_load_generator.sh & echo $!
Capture the PID for the main openssl process launched within the cpu_load_generator.sh script. Store the PID in a variable named MYPID. Type:
export MYPID=$(pidof openssl) ; echo $MYPID
To use schedtool to check the current scheduling policy¶
Use the schedtool command to display the scheduling information of the process with PID $MYPID. Type:
schedtool $MYPID
OUTPUT:
PID 2565081: PRIO 0, POLICY N: SCHED_NORMAL , NICE 0, AFFINITY 0xf
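When comparing the policy before and after each change, it can be handy to pull just the policy name out of this line. The following is a minimal sketch using a hypothetical helper named sched_policy. It assumes the output keeps the "POLICY <letter>: <name>" layout shown in the sample above.

```bash
#!/bin/bash
# sched_policy is a hypothetical helper: it extracts the policy name from
# schedtool output read on stdin, assuming the sample layout shown above.
sched_policy() {
    sed -n 's/.*POLICY [A-Z]*: *\([A-Z_]*\).*/\1/p'
}

# Sample line copied from the output above.
echo "PID 2565081: PRIO 0, POLICY N: SCHED_NORMAL , NICE 0, AFFINITY 0xf" | sched_policy
```

With a live process the call would be schedtool $MYPID | sched_policy.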
To use schedtool to modify the scheduling policy¶
Change the scheduling policy and priority of the process to FIFO (SCHED_FIFO) and 10, respectively. Type:
sudo schedtool -F -p 10 $MYPID
View the effect of the changes. Type:
schedtool $MYPID
Change the scheduling policy and priority of the process to round robin or SCHED_RR (RR) and 50, respectively. Type:
sudo schedtool -R -p 50 $MYPID
View the effect of the changes. Type:
schedtool $MYPID
Change the scheduling policy of the process to Idle or SCHED_IDLEPRIO (D). Type:
sudo schedtool -D $MYPID
View the effect of the changes.
Finally, reset the scheduling policy of the process back to the original default SCHED_NORMAL (N or other). Type:
sudo schedtool -N $MYPID
To terminate and clean up the cpu_load_generator.sh process¶
All done. Terminate the script and the openssl process it started, then delete the cpu_load_generator.sh script. Type:
pkill -f cpu_load_generator.sh
kill $MYPID
rm ~/cpu_load_generator.sh
Author: Wale Soyinka
Contributors: Steven Spencer, Ganna Zhrynova