shell script to count lines of code

At work, I was asked to count lines of code for a web development project under specific requirements: only include certain directories and files, exclude files in .git directories, don't report blank (empty) or commented lines, and include files which aren't always clearly identified as belonging to a particular language. Strictly speaking, the script doesn't count LOC as such but lines that aren't blank or commented, which in this case gives a fairly accurate LOC count.

UPDATE: Since writing my little script, I've come across other tools which are much, much more sophisticated, and would likely be a better fit if your use case is anything but very simple.

At the moment, this script supports single-line comments (a comment on a line of its own, optionally indented) using the following comment characters and strings:

#
//
/* ... */
<!-- ... -->

It also supports multi-line comments using the following comment strings:

/* 
 ...
*/
<!--
...
-->

This should cover most shell scripts, Perl, PHP, C, CSS, JS and HTML. Adding more should be trivial.
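As a rough sketch of what adding another comment style might look like, the function below extends the single-line filter with two more comment characters: `;` (Lisp, INI files) and `%` (LaTeX, Erlang). The name `strip_single_line` and the two extra styles are made up for this illustration and are not part of the script further down.

```shell
#!/bin/bash

# Hypothetical helper: print the lines of file $1 that are neither blank
# nor single-line comments. The last two expressions are the additions.
strip_single_line()
{
  sed \
    -e '/^[[:space:]]*$/d' \
    -e '/^[[:space:]]*\/\//d' \
    -e '/^[[:space:]]*#/d' \
    -e '/^[[:space:]]*;/d' \
    -e '/^[[:space:]]*%/d' \
    "$1"
}
```

Each new style is one more `-e` expression. A comment string such as `--` would need more care, since it could also match the closing `-->` of an HTML comment.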

One issue with this script is that if a multiline comment starts at the end of a line containing code, that line of code won't get counted. This is shown in the "test.script.bad" example below.
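One possible way around this, as a sketch only and not what the script below does, is to strip the comment text out of each line first and then check whether any code is left. The function name `count_code_lines` is hypothetical. It handles only `/* ... */` and `<!-- ... -->` blocks plus `#` and `//` at the start of a line, and it knows nothing about string literals, so a `/*` inside a shell glob such as `dir/*` would still be misread as a comment opener.

```shell
#!/bin/bash

# Hypothetical helper: count the lines of file $1 that still contain code
# once block comments have been cut out of them.
count_code_lines()
{
  awk '
    {
      line = $0
      out = ""
      while (length(line) > 0) {
        if (closer != "") {
          # Inside a block comment: skip ahead to its terminator, if present.
          i = index(line, closer)
          if (i == 0) { line = ""; break }
          line = substr(line, i + length(closer))
          closer = ""
        } else {
          c = index(line, "/*")
          h = index(line, "<!--")
          if (c == 0 && h == 0) {
            # No comment opener left: the rest of the line is kept.
            out = out line
            line = ""
          } else if (h == 0 || (c > 0 && c < h)) {
            out = out substr(line, 1, c - 1)
            line = substr(line, c + 2)
            closer = "*/"
          } else {
            out = out substr(line, 1, h - 1)
            line = substr(line, h + 4)
            closer = "-->"
          }
        }
      }
      # Whatever is left is code unless it is blank or a # or // comment.
      if (out !~ /^[ \t]*$/ && out !~ /^[ \t]*(#|\/\/)/) n++
    }
    END { print n + 0 }
  ' "$1"
}
```

Run against a file with the contents of "test.script.bad", this should report 1 rather than 0, because the text in front of `/*` survives the stripping.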

Script in action:

$ get.line.counts.sh
=====================================
Counting lines in files
=====================================
0	empty file, ugly name
0	test.script.bad
3	test.script.good
-------------------------------------
Total lines: 3
 
=====================================
Counting lines in dirs
=====================================
3	test.dir/test.script.good
-------------------------------------
Total lines: 3

The above example used the following test files:

$ find . -type f
./empty file, ugly name
./test.script.bad
./test.script.good
./test.dir/test.script.good

And here's what the test files contained:

empty file, ugly name:

 

test.script.bad:

a bit of code /* mixed with
multi line
comment */

test.script.good:

# this is a single line comment
    # single line comment with leading spaces
//another single line comment
 
 this is a
   bit of amazing
 code
 
 /* this is
 a multi line
 comment */
 
/* this is a single line comment */
 
   <!-- single line comment -->
 
  <!-- this is
  a multi line
  comment -->

test.dir/test.script.good:

# this is a single line comment
    # single line comment with leading spaces
//another single line comment
 
 this is a
   bit of amazing
 code
 
 /* this is
 a multi line
 comment */
 
/* this is a single line comment */
 
   <!-- single line comment -->
 
  <!-- this is
  a multi line
  comment -->

And here's the script:

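#!/bin/bash

# Files to count, one per line (names may contain spaces).
files="
empty file, ugly name
test.script.bad
test.script.good
"

# Directories to scan recursively, one per line (names must not contain spaces).
dirs="
test.dir/
"

# Print "<count><TAB><file>" for the file given as $1.
Count()
{
  local f=$1 lcc lcn ml1 ml2 tot
  # Drop blank lines and single-line comments starting with // or #.
  lcc=$(sed '/^[[:space:]]*$/d;/^[[:space:]]*\/\//d;/^[[:space:]]*#/d' "$f")
  # grep -c '' reports 0 for empty input, where "echo | wc -l" would report 1.
  lcn=$(printf '%s' "$lcc" | grep -c '')
  # Count the lines covered by /* ... */ and <!-- ... --> comments.
  ml1=$(printf '%s\n' "$lcc" | awk '/\/\*/,/\*\// {++ml1} END {print ml1+0}')
  ml2=$(printf '%s\n' "$lcc" | awk '/<!--/,/-->/  {++ml2} END {print ml2+0}')
  tot=$(( lcn - ml1 - ml2 ))
  printf '%s\t%s\n' "$tot" "$f"
}

countF()
{
  printf '%s\n' "$files" | sed '/^$/d' |
  while IFS= read -r f; do
    Count "$f"
  done
}

countD()
{
  # $dirs is left unquoted on purpose so that each entry becomes one argument.
  find $dirs -type f | sed '/\/\.git\//d' |
  while IFS= read -r f; do
    Count "$f"
  done
}

if [ -n "$files" ]; then
  resultF=$(countF)
  echo =====================================
  echo Counting lines in files
  echo =====================================
  echo "$resultF"
  echo -------------------------------------
  echo "$resultF" | awk '{sum+=$1} END {print "Total lines: " (sum+0)}'
  echo
fi

if [ -n "$dirs" ]; then
  resultD=$(countD)
  echo =====================================
  echo Counting lines in dirs
  echo =====================================
  echo "$resultD"
  echo -------------------------------------
  echo "$resultD" | awk '{sum+=$1} END {print "Total lines: " (sum+0)}'
fi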
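The directory scan above filters out .git paths after `find` has already walked them, and it reads file names line by line. A possible alternative, sketched here under the assumption that `find` supports `-print0` (GNU and BSD find do) and that the shell is bash, is to prune .git directories during the walk and pass names NUL-delimited. The helper name `for_each_file` is invented for this example.

```shell
#!/bin/bash

# Hypothetical helper: run the given command once per regular file found
# under directory $1, skipping anything inside a .git directory.
# Usage: for_each_file DIR COMMAND [ARGS...]
for_each_file()
{
  local dir=$1
  shift
  # -prune stops find from descending into .git at all;
  # -print0 and read -d '' keep odd file names intact.
  find "$dir" -name .git -prune -o -type f -print0 |
  while IFS= read -r -d '' f; do
    "$@" "$f"
  done
}
```

If `Count` is written to take the file name as its first argument, something like `for_each_file test.dir Count` could then stand in for the loop inside `countD`.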
