Wednesday, April 29, 2015

How to schedule background jobs using crontab

The cron daemon is a great user tool to automate tasks that don't require human intervention. Users pre-specify jobs to run in the background at particular times, for example, every Monday, Wednesday and Friday at 2am.

To use cron, each user creates their own crontab ('cron table') file. The command to examine one's crontab file is crontab -l.

$ crontab -l
MAILTO=commandolinux@gmail.com
0 2 * * 1,3,5 /home/peter/backups/backupWP.sh 2>&1

The MAILTO line specifies the email address to which cron sends the output of command execution. Please refer to my earlier post on how to set up an SMTP server to forward your emails.

The second crontab line specifies that the backupWP.sh script should be executed at 2am every Monday, Wednesday and Friday. The syntax may look complicated. Fortunately, you can use the on-line Crontab Generator to craft the crontab statements. If you want to learn the syntax, click here instead.
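
As a quick orientation, each crontab entry starts with five time fields (minute, hour, day of month, month, day of week) followed by the command to run. The sketch below is only an illustration, not part of cron itself: it splits the entry shown above into those fields, and the variable names are my own.

```shell
# Split a crontab entry into its five time fields plus the command.
entry='0 2 * * 1,3,5 /home/peter/backups/backupWP.sh 2>&1'

# read assigns one word per variable; the last variable takes the rest of the line.
read -r minute hour dom month dow command <<EOF
$entry
EOF

printf 'minute=%s hour=%s day-of-month=%s month=%s day-of-week=%s\n' \
    "$minute" "$hour" "$dom" "$month" "$dow"
printf 'command=%s\n' "$command"
```

Read this way, the entry means minute 0 of hour 2, on any day of the month, in any month, on weekdays 1, 3 and 5 (Monday, Wednesday and Friday).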

Create crontab

Your crontab file is initially empty. To create the file from scratch, run the crontab command, type in the crontab statements, and press Ctrl-D to end the input.

$ crontab

Alternatively, put the statements into a temporary file, say /tmp/cron, and run this command:

$ cat /tmp/cron | crontab -

Edit crontab

If you want to modify crontab contents after they are created, run this command:

$ crontab -e

The command opens the crontab file in your default text editor. It is the most versatile way to modify crontab. You can use it to create, modify, and delete crontab statements. Don't forget to save the file after you finish editing.

The downside of this edit command is the time and overhead of starting the text editor. You can append a new statement directly by using the command in the next section.

Add to crontab

When I was new to crontab, I made the mistake of trying to append a statement by running crontab without any argument. That actually replaced everything in the crontab file with the new input.

The correct command to append a new statement is:

$ (crontab -l; echo "30 04 * * 4 /home/peter/backups/report.sh 2>&1") | crontab -

The trick is to run 2 commands in a subshell grouped by the round brackets. The first command, crontab -l, fetches the existing crontab statements. The echo command echoes the new statement to be appended. The combined output from both commands is piped to the standard input of crontab.
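
One caveat with the append command is that running it twice adds the same statement twice. Below is a minimal sketch of a filter that appends a line only if it is not already present. The function name add_once is hypothetical (my own), and the demonstration runs on sample text rather than on a live crontab.

```shell
# add_once LINE: copy stdin to stdout, then append LINE unless it is already there.
add_once() {
    new_line=$1
    existing=$(cat)
    if [ -n "$existing" ]; then
        printf '%s\n' "$existing"
    fi
    # -F: fixed string, -x: match the whole line, -q: quiet
    if ! printf '%s\n' "$existing" | grep -Fxq -- "$new_line"; then
        printf '%s\n' "$new_line"
    fi
}

# Demonstration on sample text instead of a real crontab.
sample='0 2 * * 1,3,5 /home/peter/backups/backupWP.sh 2>&1'
job='30 04 * * 4 /home/peter/backups/report.sh 2>&1'
result=$(printf '%s\n' "$sample" | add_once "$job" | add_once "$job")
printf '%s\n' "$result"
```

Against a real crontab, the function would take the place of echo in the pipeline: crontab -l | add_once "$job" | crontab -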

Empty crontab

To erase all crontab contents, execute the following command:

$ crontab -r

Conclusion

You may use crontab to schedule regular maintenance and backup tasks. Once it is set up, the crontab file tends to be static. But, if you ever need to add another task, or change the scheduled times, the commands introduced in this post will come in handy.

Tuesday, April 28, 2015

Free edX course on Introduction to Java Programming

If you want to learn Java programming, here is your perfect opportunity to take an on-line course.

EdX, the popular MOOC (Massive Open On-line Course) provider, is hosting a course on Introduction to Programming with Java.

This is part 1 of a series of Java courses, and will take 5 weeks to complete on-line.

I have never taken a formal programming course on-line. So, I can't advise on the effectiveness of such a course. But, I've taken non-programming-related edX courses before, and the experience was positive.

Do sign up today. The course starts on April 28.

And let us know in a comment how you find the course.

Thursday, April 23, 2015

Configuring Monit: a free system monitoring and recovery tool

Why Monit?

One morning, I went on-line to check my WordPress website. Lo and behold, I saw this error: 'Error establishing a database connection.' My website had been down for 4 hours, luckily in the middle of the night.

I used a free website monitoring service called StatusCake. Sure enough, it did send me an email alerting me about this problem. But, sending an email at 2am was not helpful in solving the problem. What I really needed was a tool that not only detected when the database process went down, but would also restart the process without human intervention. Monit is such a tool.

For the rest of this post, I assume you want Monit to monitor a LAMP server (Linux, Apache2, MySQL, PHP).

Install Monit

To install Monit on Debian or Ubuntu, execute this command:

$ sudo apt-get install monit

As part of the installation, a monit service is created:

$ sudo chkconfig --list | grep -i monit  
monit       0:off  1:off  2:on   3:on   4:on   5:on   6:off

Configure Monit

The main Monit configuration file is /etc/monit/monitrc. To edit it, you need sudo privileges.

$ sudo vi /etc/monit/monitrc

After you make a change to the file, follow these steps to bring it into effect:

  1. Validate configuration file syntax.

    $ sudo monit -t
    

    If no error is returned, proceed to the next step.

  2. Restart Monit.

    $ sudo service monit restart
    

Global settings

The key global settings to customize are:

  • Test interval

    By default, Monit checks your system at 2-minute intervals. To customize the interval, change the value (from 120) in the following statement. The unit of measure is seconds.

    set daemon 120
    
  • Log file location

    You can specify whether Monit logs to syslog or a log file of your choice.

    # set logfile syslog facility log_daemon  
    set logfile /var/log/monit.log
    
  • Mail server

    Specify a mail server for Monit to send email alerts. I set up exim4 as an SMTP server on the localhost. For instructions, refer to my previous post.

    set mailserver localhost
    
  • Email format

    Hopefully, you won't receive many alert emails, but when you do, you want the maximum information about the potential problem. The default email format contains all the information known to Monit, but you may customize the format in which the information is delivered. To customize, use the set mail-format statement.

    set mail-format {  
        from:     monit@$HOST  
        subject:  monit alert --  $EVENT $SERVICE  
        message:  $EVENT Service $SERVICE  
                  at $DATE  
                  on $HOST 
                  $ACTION  
                  $DESCRIPTION  
                  Your faithful employee,  
                  Monit  
    }
    

    For a description of the set mail-format statement, click here.

  • Global alerts

    If any actionable event occurs, Monit sends an email alert to a predefined address list. Each email address is defined using the set alert statement.

    set alert root@localhost not on { instance, action }
    

    In the above example, root@localhost is the email recipient. Please refer to my earlier post about redirecting local emails to a remote email account.

    Note that an event filter is defined (not on { instance, action }). The recipient root@localhost will receive an email alert on every event unless it is of the instance or action type. An instance event is triggered by the starting or stopping of the Monit process. An action event is triggered by certain explicit user commands, e.g., to unmonitor or monitor a service. Click here for the complete list of event types that you can use for filtering.

    By default, Monit sends an email alert when a service fails and another when it recovers. It does not repeat failure alerts after the initial detection. You can change this default behavior by specifying the reminder option in the set alert statement. The following example sends a reminder email on every fifth test cycle if the target service remains failed:

    set alert root@localhost with reminder on 5 cycles
    
  • Enabling reporting and service management

    You can dynamically manage Monit service monitors, and request status reports. These capabilities are delivered by an embedded web server. By default, this web server is disabled. To enable it, include the set httpd statement.

    set httpd port 2812 and
        use address localhost  
        allow localhost
    

    Note: I've only allowed local access to the embedded web server. The Useful Commands section below explains the commands to request reporting and management services.
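
Monit expresses reminders and timeouts in test cycles rather than in clock time, so it helps to convert between the two. The sketch below assumes the 120-second interval from the set daemon statement above. The function name cycles_to_minutes is my own, and the integer division rounds down for intervals that do not divide evenly.

```shell
interval=120   # seconds per test cycle, taken from "set daemon 120"

# cycles_to_minutes N: wall-clock minutes covered by N test cycles.
cycles_to_minutes() {
    echo $(( $1 * interval / 60 ))
}

echo "reminder on 5 cycles = every $(cycles_to_minutes 5) minutes"
```

With these settings, the reminder email in the example above repeats every 10 minutes while the service remains failed.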

Resource monitor settings

The following are the key resources to monitor on a LAMP server.

  • System performance

    You can configure Monit to send an alert when the usage of a system resource exceeds a given threshold. The system resources that can be monitored are load average, memory, swap and CPU usage.

    check system example.com   
      if loadavg (1min) > 4       then alert 
      if loadavg (5min) > 2       then alert 
      if memory usage   > 75%     then alert 
      if swap usage     > 25%     then alert 
      if cpu usage (user)   > 70% then alert 
      if cpu usage (system) > 30% then alert 
      if cpu usage (wait)   > 20% then alert
    
  • Filesystem usage

    You can create a monitor which is triggered when the percentage of disk space used is greater than an upper threshold.

    check filesystem rootfs with path /
      if space usage > 90% then alert
    

    You may have more than 1 filesystem created on your server. Run the df command to identify the filesystem name (rootfs) and the path it was mounted on (/).

  • MySQL

    Instead of putting the MySQL-specific statements in the main configuration file, I elect to put them in /etc/monit/conf.d/mysql.conf. This is a personal preference. I like a more compact main configuration file. All files inside the /etc/monit/conf.d/ directory are automatically included in Monit configuration.

    The following statements should be inserted into the mysql.conf file.

    check process mysql with pidfile /var/run/mysqld/mysqld.pid  
          start program = "/etc/init.d/mysql start"  
          stop program = "/etc/init.d/mysql stop"  
          if failed unixsocket /var/run/mysqld/mysqld.sock then restart  
          if 5 restarts within 5 cycles then timeout
    

    If the MySQL process dies, Monit needs to know how to restart it. The command to start the MySQL process is specified by the start program clause. The command to stop MySQL is specified by the stop program clause.

    A timeout event is triggered if MySQL is restarted 5 times in a span of 5 consecutive test cycles. In the event of a timeout, an alert email is sent, and the MySQL process will no longer be monitored. To resume monitoring, execute this command:

    $ sudo monit monitor mysql
    
  • Apache

    I put the following Apache-specific statements in the file /etc/monit/conf.d/apache.conf.

    check process apache2 with pidfile /var/run/apache2.pid
          start program = "/etc/init.d/apache2 start"
          stop program = "/etc/init.d/apache2 stop"
          if failed host example.com port 80 protocol http request "/monit/token" then restart
          if 3 restarts within 5 cycles then timeout
          if children > 250 then restart
          if loadavg(5min) greater than 10 for 8 cycles then stop
    

    At every test cycle, Monit attempts to retrieve http://example.com/monit/token. This URL points to a dummy file created on the webserver specifically for this test. You need to create the file by executing the following commands:

    $ sudo mkdir /var/www/monit
    $ sudo touch /var/www/monit/token
    $ sudo chown -R www-data:www-data /var/www/monit
    

    Besides testing web access, the above configuration also monitors resource usage. The Apache process is restarted if it spawns more than 250 child processes. Apache is stopped, not restarted, if the server's 5-minute load average stays above 10 for 8 cycles.
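
To get a feel for what the filesystem monitor measures, the following sketch performs a similar comparison by hand. It is not how Monit is implemented. It assumes the POSIX output format of df -P, where the fifth column of the second line holds the percentage used, and it reuses the 90% threshold from the filesystem example.

```shell
threshold=90   # percent, same value as in the "check filesystem" example

# df -P prints one line per filesystem; column 5 is the capacity, e.g. "42%".
usage=$(df -P / | awk 'NR==2 { sub(/%/, "", $5); print $5 }')

if [ "$usage" -gt "$threshold" ]; then
    echo "ALERT: / is ${usage}% full"
else
    echo "OK: / is ${usage}% full"
fi
```

Running this by hand is a quick way to choose a sensible threshold before you put it in the Monit configuration.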

Useful commands

To print a status summary of all services being monitored, execute the command below:

    $ sudo monit summary  
    The Monit daemon 5.4 uptime: 3h 48m 

    System 'example.com'                Running
    Filesystem 'rootfs'                 Accessible
    Process 'mysql'                     Running
    Process 'apache2'                   Running

To print detailed status information of all services being monitored, execute the following:

    $ sudo monit status
    The Monit daemon 5.4 uptime: 3h 52m 

    System 'example.com'
      status                            Running
      monitoring status                 Monitored
      load average                      [0.00] [0.01] [0.05]
      cpu                               0.0%us 0.0%sy 0.0%wa
      memory usage                      377092 kB [74.0%]
      swap usage                        53132 kB [10.3%]
      data collected                    Wed, 22 Apr 2015 13:21:47
    ...        
    Process 'apache2'
      status                            Running
      monitoring status                 Monitored
      pid                               12909
      parent pid                        1
      uptime                            6d 15h 18m 
      children                          10
      memory kilobytes                  2228
      memory kilobytes total            335420
      memory percent                    0.4%
      memory percent total              65.9%
      cpu percent                       0.0%
      cpu percent total                 0.0%
      port response time                0.001s to example.com:80/my-monit-dir/token [HTTP via TCP]
      data collected                    Wed, 22 Apr 2015 13:21:47

To unmonitor a particular service (e.g., apache2):

    $ sudo monit unmonitor apache2

To unmonitor all services:

    $ sudo monit unmonitor all

To monitor a service:

    $ sudo monit monitor apache2

To monitor all services:

    $ sudo monit monitor all

Conclusion

I'd recommend that you run Monit on your server in addition to signing up for a remote website monitoring service such as StatusCake. While the 2 services do overlap, they also complement each other. Monit runs locally on your server, and can restart processes when a problem is detected. However, a networking problem may go undetected by Monit. That is where a remote monitoring service shines. In the event of a network failure, the remote monitor fails to connect to your server, and will therefore report a problem that may otherwise go unnoticed.

Tuesday, April 14, 2015

Command-line network speed testing

Is your web app slow? Is network bandwidth the problem? To diagnose the problem, begin by measuring the network bandwidth. Many users run the popular, web-based speedtest.net to capture speed performance data. This is a good solution if the X Window System is installed on the webserver. However, I have a Linux VPS server without an X graphical environment. Command line is the only viable way to perform a speed test on that server.

Power Linux users may want to use the iperf program to measure network bandwidth. To use iperf effectively, you need some basic knowledge of TCP/IP. In addition, you need to set up iperf to run on 2 machines: the 'client' and the 'server'. Yet, if you like the simplicity of using speedtest.net, you will be happy to know that the following command-line tool can access the speedtest.net servers.

speedtest-cli is a command-line Python program for testing Internet bandwidth using speedtest.net.

To download, and configure speedtest-cli, run the following commands:

$ wget -O speedtest-cli https://raw.githubusercontent.com/sivel/speedtest-cli/master/speedtest_cli.py
$ chmod +x speedtest-cli

To capture the upload and download speeds of a local machine, you can simply run speedtest-cli without any parameter. The program automatically selects the 'best' speedtest.net server to test bidirectional transmission from the local machine.

$ ./speedtest-cli
Retrieving speedtest.net configuration...
Retrieving speedtest.net server list...
Testing from Telus Communications (108.180.199.xxx)...
Selecting best server based on latency...
Hosted by TELUS (Vancouver, BC) [3.73 km]: 34.739 ms
Testing download speed........................................
Download: 3.04 Mbit/s
Testing upload speed..................................................
Upload: 0.74 Mbit/s

In the above example, the program selected a test server located only 3 kilometers away from the local machine. That is not where most of my web visitors are from, namely the east coast of the United States. The speed tests are more useful to me if the test server is located, say, in New York City.

You can designate a specific speedtest.net server in your speed testing. First, list the supported test servers.

$ ./speedtest-cli --list
...
982) Interserver, inc (Secaucus, NJ, United States) [3895.32 km]
2947) Atlantic Metro (New York City, NY, United States) [3903.25 km]
663) Optimum Online (New York City, NY, United States) [3903.25 km]
...

Then, select one from the list to specify as the test server, say 2947 (Atlantic Metro in New York City). To track network speed performance more consistently over time, you can designate the same test server in your subsequent tests.

$ ./speedtest-cli --server 2947
Retrieving speedtest.net configuration...
Retrieving speedtest.net server list...
Testing from Telus Communications (108.180.199.xxx)...
Hosted by Atlantic Metro (New York City, NY) [3903.25 km]: 2629.346 ms
Testing download speed........................................
Download: 2.79 Mbit/s
Testing upload speed..................................................
Upload: 0.84 Mbit/s
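
If you want to track the results over time, you can extract the two speed figures from the output and log them. The sketch below is only one possible approach: the function name parse_speeds is hypothetical, and it assumes the output keeps the "Download:" and "Upload:" line format shown above. The demonstration parses the sample output from this post instead of running a live test.

```shell
# parse_speeds: read speedtest-cli output on stdin, print "download,upload" in Mbit/s.
parse_speeds() {
    awk '/^Download:/ { down = $2 }
         /^Upload:/   { up = $2 }
         END          { print down "," up }'
}

sample='Testing download speed........................................
Download: 2.79 Mbit/s
Testing upload speed..................................................
Upload: 0.84 Mbit/s'

speeds=$(printf '%s\n' "$sample" | parse_speeds)
echo "$speeds"
```

In real use, you would pipe the live output through the function, e.g., ./speedtest-cli --server 2947 | parse_speeds, and append the result to a log file together with the date.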

For more information about speedtest-cli parameters, execute the command below.

$ ./speedtest-cli -h

Monday, April 6, 2015

How to n-up pages in a PDF or PPT file via the command-line

Suppose you downloaded a PowerPoint or a PDF file from slideshare. You liked it so much that you wanted to print it out. But, alas, it was 50 pages long.

This tutorial introduces the command-line tools to n-up a PPT or PDF file, i.e., batch multiple pages of the input file onto a single page of the output file. The output file is in PDF format.

To 2-up a file, you place 2 original pages on a single output page. Similarly, to 4-up a file, you place 4 original pages on a single output page. By n-upping a file, you drastically reduce the number of pages for printing.
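
To see how much paper you save, divide the page count by n and round up. Here is a small sketch for the 50-page file mentioned above. The function name nup_pages is my own.

```shell
# nup_pages PAGES N: number of output pages when N input pages share one output page.
nup_pages() {
    echo $(( ($1 + $2 - 1) / $2 ))   # integer division, rounded up
}

pages=50
for n in 2 4 6; do
    echo "${n}-up: $pages pages -> $(nup_pages "$pages" "$n") output pages"
done
```

For the 50-page file, that works out to 25 output pages at 2-up, 13 at 4-up and 9 at 6-up.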

Convert to PDF

If the original file is a PowerPoint file (PPT, PPTX, PPS, PPSX), you need to first convert it to PDF. The tool I use is unoconv.

To install unoconv on Debian,

$ sudo apt-get install unoconv

To convert input.ppt to input.pdf,

$ unoconv -f pdf input.ppt

N-up PDF

Now that you have a PDF file, use the pdfnup program to n-up the file.

To install pdfnup,

$ sudo apt-get install pdfjam

Behind the scenes, pdfnup uses the TeX typesetting system to do the n-up conversion. So, you need to first install some LaTeX-related packages.

$ sudo apt-get install texlive-latex-base texlive-latex-recommended

Now, you are ready to execute the following command to n-up input.pdf.

$ pdfnup --nup 2x3 --paper letter --frame true --no-landscape input.pdf

  • --nup 2x3: 2x3 means 2 columns and 3 rows. This houses a total of 6 input pages on each output page.

  • --paper letter: The default paper size is A4. For North Americans, specify --paper letter for the US letter size.

  • --frame: By default, the subpages on the output page are not framed, i.e., there are no borders around each subpage. To specify that a frame should be drawn around each subpage, specify --frame true.

  • --no-landscape: The default page orientation is landscape. If you want the portrait orientation, specify --no-landscape.

  • The output PDF filename for the above example is input-nup.pdf. The output filename is constructed by appending the default suffix -nup to the input filename.

The above method is not the only way to n-up a PDF file. Below is an alternative method that involves first converting the PDF file to PostScript format, then doing the n-up, and finally converting it back to PDF.

$ pdf2ps input.pdf input.ps
$ psnup -2 input.ps output.ps
$ ps2pdf output.ps output.pdf

You can choose either method to do the n-up conversion. I generally avoid the PostScript method because it involves an extra conversion step. Regardless of which method you choose, the environment will thank you for using less paper.