Some quick visualizations of California's drought water usage data

For those of you just joining us, this post is a follow-up to this piece articulating the untapped potential of data in dealing with California's drought, and to this ontological exploration of California's 8-month-old statewide water usage database stemming from the drought.

Apparently I'm a glutton for punishment, so I decided to whip up a few quick visualizations to better articulate the information in the current State Water Resources Control Board ("SWRCB") statewide drought usage database.

First, some caveats on the SWRCB data: this urban usage data is only from 2013 and 2014 and isn't normalized for weather changes, population growth or economic growth. It also doesn't capture household-level variation, which speaks to the need for a robust, secure, granular water usage and conservation database.

That said, I created a few side-by-side histograms to illustrate how the number of water districts at each level of per capita daily water use varies by month, year and mandatory restriction status. The months aren't showing in chronological order for some reason -- seemingly a quirk of pandas datetime handling that I don't have time to dig into the source code for.

We can do the same thing for 2013.
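On the month-ordering quirk: I haven't confirmed the cause, but one plausible explanation (an assumption on my part) is that the month labels end up as plain strings and get sorted alphabetically, which puts "August" ahead of "June". Here's a minimal sketch of forcing calendar order using only Python's standard library. The `labels` list and the `chronological` helper are hypothetical names for illustration, not anything from the SWRCB data or my plotting code.

```python
import calendar

# calendar.month_name is ['', 'January', ..., 'December'], so each name's
# index is its month number. Assumes the default English month names.
MONTH_NUMBER = {name: number
                for number, name in enumerate(calendar.month_name) if name}


def chronological(month_labels):
    """Return month names in January..December order instead of A..Z order."""
    return sorted(month_labels, key=MONTH_NUMBER.__getitem__)


# Hypothetical panel labels, standing in for the histogram facets.
labels = ["August", "July", "June", "September", "October"]

print(sorted(labels))         # plain string sort: alphabetical
print(chronological(labels))  # sorted by month number: calendar order
```

If the labels live in a pandas column, an ordered `pandas.Categorical` built from the same list of month names should have a similar effect, though I haven't tested that against these particular plots.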

I was also curious to see how population relates to water usage. In particular, I wondered whether any sort of "scaling law" holds here, since those are a much-debated topic here at CUSP. So I created a few log-log plots of per capita water use against population and of total use against population. Note the y axis is the log of gallons per capita per day per water utility (logGPCD2014/2013) or the log of total water usage (logTotal2013/2014), and the x axis is the log of the population in the utility's service area.

[Further note the scaling law idea is generally explored in cities, and California's water utilities emphatically do not map to urban boundaries defined by human habitation, i.e. what you see from looking out the window of a plane, or even to Metropolitan Statistical Areas.]

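Note the downward trend line in the per capita plot, with a slope of about -0.1. Since total use is population times per capita use, that implies total use grows roughly as population to the 0.9 power, so "sublinear" scaling. The R^2 is only about 0.05, though, so the relationship is pretty weak -- which is what you'd expect, since the existing urban scaling laws describe cities and these are utility service areas.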
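For anyone who wants to see the mechanics of a log-log fit like this, here's a rough sketch using ordinary least squares in plain Python. To be loud about it: the data below is SYNTHETIC, generated from a made-up power law with noise, and is not the SWRCB data. The `ols_fit` helper, the constant 150 and the noise level are all assumptions chosen only for illustration. The point is to show that the per capita slope and the total-use slope always differ by exactly 1, which is why a per capita slope near -0.1 and a total-use exponent near 0.9 are the same finding.

```python
import math
import random


def ols_fit(xs, ys):
    """Least-squares fit of y = intercept + slope * x.

    Returns (slope, intercept, r_squared).
    """
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    r_squared = sxy ** 2 / (sxx * syy)
    return slope, intercept, r_squared


# SYNTHETIC utilities: populations from 1,000 to 1,000,000, with per capita
# use following a made-up power law (exponent -0.1) plus lognormal noise.
rng = random.Random(0)
populations = [10 ** rng.uniform(3, 6) for _ in range(400)]
gpcd = [150 * p ** -0.1 * 10 ** rng.gauss(0, 0.2) for p in populations]
total_use = [p * g for p, g in zip(populations, gpcd)]

log_pop = [math.log10(p) for p in populations]
log_gpcd = [math.log10(g) for g in gpcd]
log_total = [math.log10(t) for t in total_use]

gpcd_slope, _, gpcd_r2 = ols_fit(log_pop, log_gpcd)
total_slope, _, total_r2 = ols_fit(log_pop, log_total)

print(f"per capita: slope={gpcd_slope:.3f}, R^2={gpcd_r2:.3f}")
print(f"total use:  slope={total_slope:.3f}, R^2={total_r2:.3f}")
```

Because log(total) = log(population) + log(per capita), the total-use plot will always look far tighter than the per capita plot even though both carry the same information. The per capita R^2 is the more honest gauge of how weak the relationship is.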

If I were more of a glutton for punishment, I'd put together a table of per capita usage along city or MSA boundaries, and if I were really a glutton for punishment, I'd do it using this nifty clustering algorithm to determine urban boundaries.

Please leave any questions or feedback in the comments. 



EDITED: an earlier version of this post contained graphs with unit conversion errors that have subsequently been removed.
