A step-by-step guide for writing a Linux device driver for an ultrasonic sensor HC-SR04 on a Raspberry Pi - Part 2

Welcome back to the second and last part of the tutorial on how to write a Linux device driver for the HC-SR04 ultrasonic range sensor using a Raspberry Pi.

If you have not read the first part yet, make sure to catch up here.

In the first part I covered the electrical circuit and the hardware interface the sensor provides. Then we cross-compiled a hello world kernel module for the Raspberry Pi. Afterwards we extended the module and registered a character device, which acts as the interface between the kernel module and user space applications. We also discussed major and minor device numbers and created a script that loads the kernel module and creates the filesystem node with the right major number.

In this second part of the tutorial we will cover the following topics:

  1. Setting up output GPIO lines in order to trigger a measurement
  2. Registering an interrupt on a GPIO line for rising and falling edges
  3. Measuring time intervals in the kernel and calculating the range in mm
  4. Synchronization of an interrupt handler with the read-syscall by waiting on an event
  5. Handling concurrency in the Linux device driver

The code for this tutorial can be found on GitHub.

If you want to read more about Linux device drivers I can highly recommend the free book Linux Device Drivers, Third Edition. Another great book for getting an overview of the Linux kernel is Understanding the Linux Kernel, Third Edition.

Triggering a measurement via GPIOs

The first step we are going to implement is triggering a measurement. A measurement is triggered by setting the trigger pin of the ultrasonic sensor to a high level for at least 10 microseconds, as can be seen in the timing diagram below.

Timing diagram of an HC-SR04 ultrasonic sensor showing the trigger and echo signals

For the trigger pin we are going to use GPIO number 26, but any GPIO number can be used. It can be configured with #define GPIO_TRIGGER at the top of the source code. First we will set up and initialize the GPIO line in the module_init function. In the module_exit function we also need to free the GPIO line again. The actual triggering of a measurement is done in the character device read() operation. That means a user space application can trigger a measurement by reading from the file /dev/hc-sr04 and can therefore decide whether measurements are done cyclically or only when a specific event happens.

Setting up a GPIO line requires these steps:

  1. Use gpio_is_valid(int number) to verify that the GPIO number is valid, e.g. non-negative and in the range of available GPIOs for the given architecture.
  2. Request the GPIO with gpio_request(unsigned gpio, const char *label). This calls into the Linux pinctrl framework, which is used by the gpiolib and is responsible for assigning the right function to a pin. A pin can usually have multiple functions: it can be used as a digital GPIO line or have alternative functions like I2C, SPI, UART, etc. The pinctrl framework also makes sure there are no conflicting assignments. gpio_request configures the pin as a digital GPIO and returns 0 on success or a negative error number on failure.
  3. Set the direction of the GPIO with either int gpio_direction_input(unsigned gpio) or int gpio_direction_output(unsigned gpio, int value). For an output, the value argument is the initial level of the pin.
  4. Set the value of an output pin to either high or low with void gpio_set_value(unsigned gpio, int value). Pass 0 for a low level and 1 for a high level.

To sum it up, in the module's init function we have the following for the trigger GPIO:

    if (!gpio_is_valid(GPIO_TRIGGER))
    {
        pr_err("[HC-SR04]: GPIO %d is not valid\n", GPIO_TRIGGER);
        return -ENODEV;
    }

    // On failure there is nothing to free: the GPIO was never acquired
    if (gpio_request(GPIO_TRIGGER, "GPIO_TRIGGER") < 0)
    {
        pr_err("[HC-SR04]: ERROR: GPIO %d request\n", GPIO_TRIGGER);
        return -EBUSY;
    }

    // Configure the pin as an output and start with a low level
    gpio_direction_output(GPIO_TRIGGER, 0);
    gpio_set_value(GPIO_TRIGGER, 0);

In the module's exit function we also need to free the GPIO again:

static void hc_sr04_exit(void)
{
    ...
    gpio_set_value(GPIO_TRIGGER, 0);
    gpio_free(GPIO_TRIGGER);
    ...
}

In the read function we can now trigger a measurement by setting the output to high, waiting for at least 10 microseconds and setting the pin back to low again. For the delay of 10 microseconds we will use udelay(unsigned long usecs).

The function is found in <linux/delay.h>. The Linux kernel provides various delay functions that work in different ways. One way is busy waiting, which is what udelay does. Other functions sleep and allow the scheduler to switch to another process. Choosing the right function depends on the amount of time you need to delay and on the context the code is running in (e.g. interrupt or process). It would not make sense to sleep if you only want a delay of one microsecond, because the overhead would simply be too high. More details and advice can be found in the kernel documentation. In a nutshell, we can trigger a measurement like this inside the read-function:

// Trigger measurement
gpio_set_value(GPIO_TRIGGER, 1);
udelay(10); // A minimum period of 10us is needed to trigger a measurement
gpio_set_value(GPIO_TRIGGER, 0);

Registering an interrupt handler on the input GPIO

After the measurement the HC-SR04 sensor will set the echo pin to high for a time which is proportional to the measured distance. In the device driver this time must be measured, starting at the rising edge and ending at the falling edge. For that purpose we will register an interrupt handler on both the rising and the falling edge. This is done using the Linux interrupt framework. First we configure the GPIO_ECHO pin as an input pin, then we retrieve an interrupt number from the gpiolib function gpio_to_irq, and afterwards we register a function in the device driver as the interrupt handler.

// .. inside module_init
    gpio_irq_number = gpio_to_irq(GPIO_ECHO);
    if (request_irq(gpio_irq_number,
                    gpio_echo_irq_handler,  // No cast needed, the handler already has the irq_handler_t signature
                    IRQF_TRIGGER_RISING | IRQF_TRIGGER_FALLING,  // Handler will be called on rising and falling edges
                    "hc-sr04",              // Name shown in /proc/interrupts
                    NULL)) {                // dev_id, unused because we handle only one device
        pr_err("[HC-SR04]: Cannot register interrupt number: %d\n", gpio_irq_number);
        return -EBUSY;
    }

Registering an interrupt handler is done with the request_irq function. Our interrupt handler function has the signature shown below. The interrupt number is passed as the first argument when the handler is called. The second argument is a void pointer. In request_irq we passed NULL as the last argument, but we could have passed a pointer to some client data. This is useful if a device driver handles multiple instances and the interrupt handler needs to distinguish which of the devices triggered the interrupt. In our case we only handle one device and therefore we do not use it.

irqreturn_t gpio_echo_irq_handler(int irq, void *dev_id);

In the module_exit function we need to free the interrupt number again using the free_irq function, since the CPU has a limited number of interrupts.

Measuring time intervals in the kernel and calculating the range in mm

Once our interrupt handler is registered, it is called on rising and falling edges. So first we query the current GPIO value (high or low) with the gpio_get_value function to determine whether a rising or a falling edge triggered the interrupt. If the value is high it was a rising edge; if the value is low it was a falling edge.

To measure how long the echo pin stays high we will use the Linux kernel's timekeeping functions to retrieve the current time in nanoseconds. For that purpose we will have two global variables of type ktime_t. This type is defined in include/linux/ktime.h and represents a time in nanoseconds.

volatile ktime_t ktime_start, ktime_end;

On a rising edge we will update ktime_start with the current time and on a falling edge we will update ktime_end. Since we are only measuring a time interval and not an absolute point in time (wall-clock time such as UTC), we will use the function ktime_t ktime_get(). It returns a time that starts at boot and increases monotonically, which makes it well suited for measuring time intervals. You can read more about it in ktime accessors.
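The edge logic described above can be tried out in user space. The following is a minimal sketch, not the driver's actual handler: struct echo_timing, echo_edge and echo_pulse_ns are hypothetical names, the level argument stands in for the result of gpio_get_value, and now_ns stands in for the value returned by ktime_get.

```c
#include <stdint.h>

/* Hypothetical user-space stand-in for ktime_start and ktime_end. */
struct echo_timing {
    int64_t start_ns; /* written on the rising edge  */
    int64_t end_ns;   /* written on the falling edge */
};

/*
 * Mirrors the decision made in the interrupt handler.
 * level:  current level of the echo pin (non-zero = high)
 * now_ns: monotonic timestamp in nanoseconds
 * Returns 1 if a falling edge completed a pulse, 0 otherwise.
 */
static int echo_edge(struct echo_timing *t, int level, int64_t now_ns)
{
    if (level) {
        /* Pin is high now, so this was a rising edge */
        t->start_ns = now_ns;
        return 0;
    }

    /* Pin is low now, so this was a falling edge */
    t->end_ns = now_ns;
    return 1;
}

/* Pulse width in nanoseconds, the counterpart of ktime_sub(end, start). */
static int64_t echo_pulse_ns(const struct echo_timing *t)
{
    return t->end_ns - t->start_ns;
}
```

Feeding it a rising edge at 1000 ns and a falling edge at 5831000 ns yields a pulse width of 5830000 ns. Because only the difference of two monotonic timestamps is used, the absolute value of the clock does not matter.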

At a later point in the device driver we will simply subtract ktime_start from ktime_end using the helper function ktime_sub and derive the range in millimeters from the elapsed time using the formula:

Range = High Level Time * 340 m/s / 2

Since the accuracy of the sensor is only about 3 millimeters the result will be calculated in millimeters as well. In order to avoid floating point operations and also keep the precision loss in the integer divisions as low as possible we will first convert the nanoseconds to microseconds then multiply by 340 and do the divisions afterwards.

ktime_t elapsed_time = ktime_sub(ktime_end, ktime_start);
unsigned int range_mm = ((unsigned int) ktime_to_us(elapsed_time)) * 340u / 2u / 1000u;
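To check the integer arithmetic outside of the kernel, the same calculation can be put into a small user-space function. This is only a sketch: echo_us_to_range_mm and SPEED_OF_SOUND_M_PER_S are names I made up for illustration, and the widening to 64 bit is an extra precaution that the driver code above does not take.

```c
#include <stdint.h>

/* Speed of sound used in the formula above, in m/s */
#define SPEED_OF_SOUND_M_PER_S 340u

/*
 * Converts the echo pulse width in microseconds to a range in millimeters.
 * us * 340 m/s is a distance in micrometers, / 2 accounts for the way
 * there and back, / 1000 converts micrometers to millimeters.
 */
static uint32_t echo_us_to_range_mm(uint32_t echo_us)
{
    /* Multiply first, divide last, to keep the integer rounding loss small */
    uint64_t range_um = (uint64_t)echo_us * SPEED_OF_SOUND_M_PER_S / 2u;

    return (uint32_t)(range_um / 1000u);
}
```

For example, an echo pulse of 5830 microseconds gives 991 mm and a pulse of 1000 microseconds gives 170 mm. Dividing by 1000 before multiplying would truncate short pulses to 0, which is why the order of operations matters here.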

Synchronization of the interrupt handler with the read operation

Now that we have the two building blocks, triggering a measurement and interrupts for rising and falling edges, we need to synchronize the two. Triggering a measurement is done in the read-function, which is executed in process context, i.e. when a user-space process issues a read-syscall on the device /dev/hc-sr04. From the read-function we want to return the measured distance. However, the interrupts are executed asynchronously and are not part of the thread of execution of the read-function. Therefore the read-function cannot satisfy the request immediately and has to be synchronized with the interrupt handler. This is done by waiting on an event inside the read-function, which means that it sleeps until a condition becomes true.

The following sequence diagram illustrates the described issue:

Sequence diagram for synchronizing the read-syscall with the interrupt by using an event

In order to sleep and wait for an event inside the read-function a data structure called a wait queue is needed. It is a queue containing the processes that are waiting for a specific event. We first need to declare a wait_queue_head_t and initialize it using the init_waitqueue_head function. The type wait_queue_head_t is declared in <linux/wait.h>. We also need a condition that is tested after a wake-up. For that we declare a volatile int condition_echo. We reset condition_echo to 0 before triggering a measurement, and the interrupt handler sets it to 1 before waking up the sleeping process.

// Global
wait_queue_head_t wait_for_echo;
volatile int condition_echo;

// ... in module_init
init_waitqueue_head(&wait_for_echo);

For waiting inside the read-function we are going to use wait_event_interruptible_timeout(queue, condition, timeout). We use the timeout variant so that, if something goes wrong and the interrupts for the rising and falling edges are never triggered, we do not block forever. The timeout is specified in jiffies. Jiffies count the timer interrupts in the kernel: every time a timer interrupt happens, an internal counter called jiffies_64 is incremented. To convert a jiffies value into seconds it must therefore be divided by the constant HZ, which is the number of jiffies per second. We are going to set the timeout to 100 milliseconds and therefore need to pass HZ / 10 to wait_event_interruptible_timeout.
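The relation between milliseconds, HZ and jiffies is easy to verify with a few lines of user-space C. The function ms_to_ticks below is a hypothetical helper written for this illustration, with HZ passed in as a parameter since its value depends on the kernel configuration. Inside the kernel you would not write this yourself but use the existing helper msecs_to_jiffies from <linux/jiffies.h>.

```c
/*
 * Converts a duration in milliseconds to a number of timer ticks.
 * hz is the number of ticks per second (the kernel's HZ constant).
 * The result is rounded up so that the timeout is never shorter
 * than requested.
 */
static unsigned long ms_to_ticks(unsigned long ms, unsigned long hz)
{
    return (ms * hz + 999u) / 1000u;
}
```

With HZ values of 100, 250 and 1000, a timeout of 100 milliseconds corresponds to 10, 25 and 100 ticks, which matches HZ / 10 in all three cases.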

// .. inside read-function

// Reset condition for waking up
condition_echo = 0;

// Trigger measurement
gpio_set_value(GPIO_TRIGGER, 1);
udelay(10); // A minimum period of 10us is needed to trigger a measurement
gpio_set_value(GPIO_TRIGGER, 0);
    
// Wait until the interrupt for the falling edge happened
// A timeout of 100ms is configured 
remaining_delay = wait_event_interruptible_timeout(wait_for_echo, condition_echo, HZ / 10);
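The return value of wait_event_interruptible_timeout tells us why the wait ended: it is 0 if the timeout elapsed while the condition was still false, positive (the remaining jiffies, at least 1) if the condition became true, and -ERESTARTSYS if the sleep was interrupted by a signal. The sketch below shows one possible way to map this to a status for the read-function. It is an assumption on my side, not necessarily what the code in the repository does: echo_wait_status is a hypothetical helper, returning -ETIMEDOUT on a timeout is just one reasonable choice, and SKETCH_ERESTARTSYS only mirrors the kernel-internal constant so that the example compiles in user space.

```c
#include <errno.h>

/* Kernel-internal error code, not part of the user-space <errno.h> */
#define SKETCH_ERESTARTSYS 512

/*
 * Maps the return value of wait_event_interruptible_timeout to a status:
 *   negative  -> interrupted by a signal, pass the error on
 *   zero      -> timeout, no falling edge was seen
 *   positive  -> condition became true, a measurement is available
 */
static long echo_wait_status(long remaining)
{
    if (remaining < 0)
        return remaining;

    if (remaining == 0)
        return -ETIMEDOUT;

    return 0;
}
```

Only when this status is 0 does it make sense to go on and calculate the range from the two timestamps.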

Inside the interrupt handler triggered for the falling edge we want to wake up the sleeping process. This is done using wake_up_interruptible which is the counterpart function to the wait_event_interruptible_timeout function. Before waking up we must set the condition to true since the wait-function will test if the condition evaluates to true.

// .. inside interrupt handler
// Falling edge -> store time and wakeup read-function
else if (gpio_value == 0) {
    ktime_end = ktime_get();       
          
    condition_echo = 1;
    wake_up_interruptible(&wait_for_echo);
}

Once the wait-function returns, we can calculate the measured range as described above and return the value in millimeters to the user process.

Handling concurrency in the Linux device driver

The final topic we will discuss is handling concurrency in the device driver. Let's first take a look at why we need to care about this topic.

In Linux, and especially on an SMP (Symmetric Multi-Processing) system, multiple user processes may open the character device (like /dev/hc-sr04) simultaneously and also perform read operations using the read syscall. On the one hand this can lead to data races, since the read operation could be executed on multiple cores at the same time. On the other hand it can lead to issues because the device driver controls hardware, and accessing the hardware simultaneously (e.g. triggering a new measurement while another one is running) would not make sense. To address this concern the driver should implement locking mechanisms to ensure that only one process can access the device at a time. This can be achieved through the use of mutexes or spinlocks.

In our case we want to address two cases:

  1. Only one user process shall be allowed to open the device at a time
  2. Inside one user process multiple threads could issue a read-call. Only one thread shall be allowed to issue a read-call at a time, since a measurement is triggered in the read function.

For the first part we are going to use an atomic variable, so we avoid the need for locking with a mutex or semaphore. Locking always comes with the risk of introducing deadlocks. Our method is to set up an atomic integer variable with an initial value of -1. When the open-syscall operation is issued we increment and test the atomic variable using the atomic_inc_and_test function. The function increments the atomic integer variable by one and returns true if the result is zero. That means if no process has opened the device yet, the result is true. If the device was already opened, the atomic integer will be greater than zero and the function returns false. In that case we return the -EBUSY error number. In the release function of the character device the atomic variable is reset to -1. With this method we ensure that only one process can have the device open at a time.

atomic_t opened = ATOMIC_INIT(-1);

int hc_sr04_open(struct inode *inode, struct file *filp) {
    // Allow only one process to open the device at the same time
    if (atomic_inc_and_test(&opened)) {
        return 0;
    }
    else {
        return -EBUSY;
    }
}

int hc_sr04_release(struct inode *inode, struct file *filp) {
    atomic_set(&opened, -1);
    return 0;
}
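The counting scheme can be tried out in user space with C11 atomics. This is only a sketch of the idea under that assumption: sketch_open and sketch_release are hypothetical stand-ins for the driver's open and release operations, and atomic_fetch_add plus a comparison takes the place of the kernel's atomic_inc_and_test.

```c
#include <errno.h>
#include <stdatomic.h>

/* -1 means "nobody has opened the device" */
static atomic_int opened = -1;

/* Returns 0 for the first caller and -EBUSY for everyone after that. */
static int sketch_open(void)
{
    /* atomic_fetch_add returns the old value, so old + 1 is the new one */
    if (atomic_fetch_add(&opened, 1) + 1 == 0)
        return 0;

    return -EBUSY;
}

/* Called by the single owner: makes the device available again. */
static void sketch_release(void)
{
    atomic_store(&opened, -1);
}
```

Calling sketch_open twice in a row returns 0 and then -EBUSY. After sketch_release the next call succeeds again. Rejected callers still increment the counter, which is harmless here because the release always resets it to -1.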

For the second issue, that only one thread shall issue a read-call at a time, we are going to use a semaphore for mutual exclusion. The difference to the atomic variable we used before is that with the semaphore the calling process is put to sleep and waits until the semaphore is released again. The semaphore is acquired with down_interruptible and released with up. The critical section covers almost the whole read-function. Another guard we will build in is enforcing a minimum interval of 60 milliseconds between two measurements, as recommended in the sensor's datasheet. For that purpose, the time of the last measurement is stored and it is checked that at least 60 milliseconds have elapsed since then. If that is not the case, the semaphore is released immediately and the error number -EBUSY is returned to indicate to the user process that the sensor is not available yet.
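The 60 millisecond guard boils down to a single comparison of two monotonic timestamps. The following user-space sketch shows that check in isolation. The names check_measurement_interval and MIN_MEASUREMENT_INTERVAL_NS are made up for this example, and the timestamps are plain nanosecond values standing in for what ktime_get would return in the driver.

```c
#include <errno.h>
#include <stdint.h>

/* Minimum time between two measurements: 60 ms in nanoseconds */
#define MIN_MEASUREMENT_INTERVAL_NS (60LL * 1000 * 1000)

/*
 * now_ns:  current monotonic time in nanoseconds
 * last_ns: monotonic time of the previous measurement in nanoseconds
 * Returns 0 if a new measurement may be triggered, -EBUSY if the
 * previous one is less than 60 ms ago.
 */
static int check_measurement_interval(int64_t now_ns, int64_t last_ns)
{
    if (now_ns - last_ns < MIN_MEASUREMENT_INTERVAL_NS)
        return -EBUSY;

    return 0;
}
```

In the driver this check would run inside the critical section, after the semaphore has been acquired, so that two threads cannot both pass it at the same time.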

What we have learned

In this post we continued with the basic device driver structure from part 1 and finally started to communicate with the sensor hardware. Programming a Linux device driver can be daunting but we broke it down into manageable steps:

We set up the output GPIO lines to trigger a measurement inside the read-function. Also we registered an interrupt handler routine for rising and falling edges on the echo pin of the sensor. Inside the interrupt handler we measured the current time in nanoseconds. Since the interrupt handler is executed in an asynchronous way we also delved into the topic of synchronization using an event to wait on and put our process to sleep. Finally we also covered the topic of concurrency to avoid that multiple processes access the shared resource at the same time.

Finally, here is the link to the repository again containing the code for this tutorial. Feel free to also have a look at the sample applications in the repository in C++ and Python that show how to read the measured values from the Linux device driver periodically.