Google Analytics and AWStats eval

(This page derived from original at [1])

NOAA announcement for using Google Analytics, June 19, 2013

NOAA has approved the use of Google Analytics in the domain. If you would like to use it internally for your site(s), preliminary guidance is provided on the NWC Google site.

To implement Google Analytics for your site, follow #2 for setup procedures. Before setting up the Google Analytics account, you must still obtain your office's approval. The last step under "Technical modifications" requires you to report your use of Google Analytics by filling out a form. This information will be sent to Commerce to satisfy the social media requirement to report all sites and site contacts using this tool. It is not necessary to submit a separate application in the Commerce Social Media Application Tracking System.

To implement Google Analytics for your Google Site, follow #3 for setup procedures.

Please contact with any questions.

NGDC Policy for using Google Analytics


Comparison of Google Analytics (GA) with Analog Reports (AR)

Cool features Analog doesn't have

Google Analytics features

Typical monthly Analog Report for NGDC

  • Easy to read graphs (lovely eye-candy!) of almost anything over a user-specified time span
    • Analog provides only a month at a time. Other time ranges can be generated, but require a custom config and generation run.
  • GUI allows drill-down from overviews into details
    • Analog reports are one large, static page after they are generated. WYSIWYG -- no more, no less.

Lots more...

Can it replace our Analog web reports?

Not completely. While Google Analytics (GA) has some fabulous capabilities and a great GUI, some data about the web servers is only available in the local log files. GA requires JavaScript in the HTML of the web page, which means it inherently only works when an HTML page loads successfully, and it has no access to the local server file system. Here is a typical line from the log files for a single hit:

- - [31/May/2013:04:02:13 -0600] "GET /arcgis/services/hot_springs/MapServer/WmsServer
 200 3353 "-" "Spatineo Monitor GetMapBot (" 452 3517

  • Google Analytics can't know:
  1. Original file size requested ("3353"). This can be very different from how much is delivered.
  2. Bytes received from client ("452")
  3. An unsuccessful server return code (e.g. 404 or 503; the "200" here is successful)
  4. Anything at all about non-HTML file types, e.g. XML, JSON, CSV, PDF, tar.gz
  5. Anything at all about other protocols, such as FTP
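
To make the point concrete, here is a minimal sketch of pulling those server-side fields out of a log line. It assumes an Apache "combined" style layout followed by two extra byte counters, which is a guess based on the sample above and not a confirmed description of our log format. The host, URL, and user agent in `sample` are placeholders, and the field names (`size`, `bytes_in`, `bytes_out`) are made up for illustration.

```python
import re

# Assumed layout: combined log format plus two trailing byte counters.
# This is inferred from the sample hit, not taken from the server config.
LOG_PATTERN = re.compile(
    r'(?P<host>\S+) (?P<ident>\S+) (?P<user>\S+) '
    r'\[(?P<time>[^\]]+)\] '
    r'"(?P<request>[^"]*)" '
    r'(?P<status>\d{3}) (?P<size>\d+|-) '
    r'"(?P<referer>[^"]*)" "(?P<agent>[^"]*)" '
    r'(?P<bytes_in>\d+) (?P<bytes_out>\d+)'
)


def parse_line(line):
    """Return a dict of fields for one hit, or None if the line does not match."""
    match = LOG_PATTERN.match(line)
    if match is None:
        return None
    hit = match.groupdict()
    hit["status"] = int(hit["status"])
    # A "-" in the size field means no body was sent.
    hit["size"] = 0 if hit["size"] == "-" else int(hit["size"])
    hit["bytes_in"] = int(hit["bytes_in"])
    hit["bytes_out"] = int(hit["bytes_out"])
    return hit


# Placeholder line: only the timestamp, status, and byte counts mirror the sample.
sample = (
    '192.0.2.1 - - [31/May/2013:04:02:13 -0600] '
    '"GET /example/path HTTP/1.1" 200 3353 "-" "ExampleBot" 452 3517'
)
hit = parse_line(sample)
```

None of these values ever reach a JavaScript tracker, so any report built on them has to come from the log files.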

Can anything else replace our Analog web reports?

[AWStats] in particular comes highly recommended, providing pretty much everything Analog does, but with a friendlier and more interactive GUI. It also does some things I had to do manually under Analog that make interpreting reports easier.


  • GUI for switching months (Analog requires editing the URL)
  • Mobile OS/browser detection built-in (Analog does this with custom BROWALIAS in config)
  • Built-in reporting of Chrome browser (Analog calls this "Safari/537", since Chrome's user-agent string carries a Safari token inherited from the WebKit engine it was built on)
  • Counts "human" visits (as opposed to bots)
  • Whois (site owner) and Country info for IP addresses
  • Session entry/exit/duration
  • Reports screen sizes (requires adding some HTML tags to the index page)
  • XML output files; we could build custom reports easily with XSLT
  • Actively developed (latest release Jan. 2014)


  • No Directory report; i.e. no totals for /mgg and /stp

e.g. [Directory Report]

  • No File Size report; i.e. no breakdown to show downloads of large data files

e.g. [File Size Report]

  • Slow; about 10 times slower than Analog, as reported by AWStats itself. Analog takes about 25 minutes for a monthly report, so we might expect this to increase to about 4 hours (25 minutes × 10 = 250 minutes). Not a factor for the cron-generated standard reports, but this will be tedious for any custom reports we want to make (such as a report just for DMSP).
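
If we moved to AWStats, the missing Directory and File Size reports could in principle be rebuilt from the raw logs with a small script. The sketch below is one possible approach, not something AWStats or Analog provides: it assumes the hits have already been parsed into `(path, bytes)` pairs, and the function names, size buckets, and sample paths are all hypothetical.

```python
from collections import Counter

# Hypothetical size buckets; Analog's own File Size report uses its own ranges.
SIZE_BUCKETS = [
    (1024, "< 1 kB"),
    (1024 ** 2, "1 kB - 1 MB"),
    (1024 ** 3, "1 MB - 1 GB"),
]


def top_directory(path):
    """Return the top-level directory of a request path, or "/" for root files."""
    path = path.split("?", 1)[0]      # drop any query string
    parts = path.split("/")           # "/mgg/a.html" -> ["", "mgg", "a.html"]
    if len(parts) >= 3 and parts[1]:
        return "/" + parts[1]
    return "/"


def size_bucket(nbytes):
    """Return the label of the first bucket whose upper limit exceeds nbytes."""
    for limit, label in SIZE_BUCKETS:
        if nbytes < limit:
            return label
    return ">= 1 GB"


def summarize(hits):
    """Tally requests and bytes per top-level directory, and requests per size bucket."""
    dir_requests, dir_bytes, size_counts = Counter(), Counter(), Counter()
    for path, nbytes in hits:
        directory = top_directory(path)
        dir_requests[directory] += 1
        dir_bytes[directory] += nbytes
        size_counts[size_bucket(nbytes)] += 1
    return dir_requests, dir_bytes, size_counts


# Made-up hits for illustration only.
example_hits = [
    ("/mgg/bathymetry/relief.html", 3353),
    ("/stp/", 500),
    ("/stp/dmsp/data.tar.gz", 5 * 1024 ** 2),
    ("/index.html", 100),
]
dir_requests, dir_bytes, size_counts = summarize(example_hits)
```

A script like this would also have to apply the same host and page exclusions as the main reports, or its totals would not line up with them.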

[Comparison with Analog]

  • Matching the configuration to our Analog config may require significant effort, and is critical to ensure the ongoing numbers are comparable to those in the past. For example, be sure to exclude the same local hosts and status check pages. Our config files are generated by an old Perl script that will have to be modified to output AWStats config files.
  • Retrospective stats for past months and years will require manual regeneration with AWStats. For Analog, we have an archive of monthly reports for each of our servers and overall NGDC going back to 2003.
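
As a rough illustration of the config-matching work, the sketch below translates Analog exclusion lines into AWStats-style directives. It is written in Python and is not the existing Perl generator. It assumes the Analog config uses `HOSTEXCLUDE` and `FILEEXCLUDE` with `*` wildcards, and that the AWStats equivalents are `SkipHosts` and `SkipFiles` with `REGEX[...]` entries; the exact syntax should be checked against the AWStats documentation before relying on it. The host and file patterns shown are invented examples, not our real exclusions.

```python
import re


def wildcard_to_regex(pattern):
    """Convert a '*' wildcard pattern into an anchored regular expression."""
    pieces = [re.escape(piece) for piece in pattern.split("*")]
    return "^" + ".*".join(pieces) + "$"


def translate_excludes(analog_lines):
    """Map HOSTEXCLUDE/FILEEXCLUDE lines to SkipHosts/SkipFiles directives."""
    hosts, files = [], []
    for line in analog_lines:
        line = line.split("#", 1)[0].strip()   # strip comments and whitespace
        parts = line.split(None, 1)
        if len(parts) != 2:
            continue
        command, pattern = parts[0].upper(), parts[1].strip()
        if command == "HOSTEXCLUDE":
            target = hosts
        elif command == "FILEEXCLUDE":
            target = files
        else:
            continue                            # ignore unrelated commands
        if "*" in pattern:
            target.append("REGEX[%s]" % wildcard_to_regex(pattern))
        else:
            target.append(pattern)
    return [
        'SkipHosts="%s"' % " ".join(hosts),
        'SkipFiles="%s"' % " ".join(files),
    ]


# Invented example input, not the real NGDC exclusions.
analog_config = [
    "# local traffic",
    "HOSTEXCLUDE *.example.gov",
    "FILEEXCLUDE /status.html",
    "LANGUAGE ENGLISH",
]
awstats_lines = translate_excludes(analog_config)
```

Even with a translation like this, it would be worth running both tools over the same month of logs and comparing totals before trusting the new numbers.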

Question: Is anyone (NCDC?) doing any automated parsing of the Analog report output to report web stats up the NOAA chain of command?

Here's a good list of other possibilities: