Geo functions
geolocation
Commentary
added in 0.2.5
Generates real geolocations in a variety of formats. This generator requires downloading an external data set. Please see below.
Getting the data sets
This generator is powered by public domain data. For each country you want to generate locations in, you need the corresponding data set. Download it to your machine, then follow the instructions below.
Configuring ShadowTraffic
To give ShadowTraffic access to the geolocation data, you need to do two things.
First, volume mount the data set into the ShadowTraffic container. You can put it in whatever path you wish. For example:
docker run -v $(pwd)/YourDataSet.csv:/home/YourDataSet.csv ...
Second, add the file to your globalConfigs in your ShadowTraffic file, under the corresponding country name. Be sure to use the path you mounted it to inside the ShadowTraffic container, not the path on your host. For example, to add geolocation data for the United States:
{
  "generators": [],
  "globalConfigs": {
    "geolocation": {
      "countryFiles": {
        "United States": "/home/YourDataSet.csv"
      }
    }
  },
  "connections": {}
}
Invoking the generator
By default, when you execute this generator, it will choose geolocations anywhere within the specified country.
You can, however, set a number of other narrowing criteria depending on the chosen country. For example, to narrow locations within the United States, you can set the state, city, and zipCode parameters.
All narrowing parameters take the form of an array, and the search ORs the elements together. For example, supplying the parameter "state": ["TX", "NY"] searches for locations in Texas or New York.
If you supply multiple narrowing parameters, like state and city, the search ANDs the parameters together. If you added "city": ["Austin", "New York City"] to the previous example, it would only return locations in Austin, Texas or New York City, New York.
All search criteria must match the underlying data set exactly. ShadowTraffic doesn't normalize capitalization, whitespace, and so on.
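The combined state and city example above can be written out as a generator configuration. This is a sketch that uses only the parameters named in this section. The state and city values come from the examples above and are assumed to appear in your data set with exactly this spelling.

```json
{
  "_gen": "geolocation",
  "country": "United States",
  "state": ["TX", "NY"],
  "city": ["Austin", "New York City"]
}
```

Within each array the values are ORed, and the two arrays are ANDed, so a location in Houston, Texas would match state but not city and would not be selected.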
Output formats
This generator can output geolocation data in a variety of formats by setting the format parameter.
Address
If not explicitly set, the address format is used, which generates a complete address string according to the country's mailing convention. For example, in the United States, that might be: 72 Oak St, Rochester NY 14602.
Coordinates
Setting format to coordinates generates maps of the form:
{
  "latitude": 30.1853900100001,
  "longitude": -97.888961035
}
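To request this format, set format on the generator. A minimal sketch, assuming the United States data set has been configured as described in Configuring ShadowTraffic:

```json
{
  "_gen": "geolocation",
  "country": "United States",
  "format": "coordinates"
}
```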
Object
Setting format to object generates maps of structured data. The specification of this structure depends on the country you're generating data for.
Multiples
Setting format to an array of formats returns a map whose keys are the format names and whose values are the formatted locations.
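As an illustration of that shape, a generator with format set to ["address", "coordinates"] would produce maps along these lines. The values are placeholders copied from the samples earlier in this section, so unlike real output they are not guaranteed to describe the same location.

```json
{
  "address": "72 Oak St, Rochester NY 14602",
  "coordinates": {
    "latitude": 30.1853900100001,
    "longitude": -97.888961035
  }
}
```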
Caching
When you run ShadowTraffic and a particular geolocation generator for the first time, the supporting data is loaded from scratch. Depending on the size of the data set, this can take a while.
After that, the loaded data is cached in an embedded, on-disk database, and subsequent runs of that generator read from the cache.
Changing any search criteria, like city, will force a reload of the data.
To improve development cycles, it's recommended that you preserve that cache, which lives under the /tmp/geolocations path inside the container, by mounting a directory from your host as another volume. For example, you might run ShadowTraffic like this:
docker run -v $(pwd)/geolocations:/tmp ...
If you need to wipe out the cache, delete the directory you mounted on your host.
Examples
Generating US addresses
Generate string addresses in the United States.
{
  "_gen": "geolocation",
  "country": "United States"
}
Generating Texas addresses
Use state, city, and other parameters to narrow the location candidates. This example generates geolocations only in Austin and Houston in the state of Texas.
{
  "_gen": "geolocation",
  "country": "United States",
  "state": [
    "TX"
  ],
  "city": [
    "Austin",
    "Houston"
  ]
}
Generating multiple formats
If you set format to an array, geolocation returns an object that maps each format name to the location's data in that format. For example, if you request object and address, each event contains a map with two keys: one holding the object version and one holding the string address version of the same location.
{
  "topic": "locations",
  "value": {
    "_gen": "geolocation",
    "country": "United States",
    "format": [
      "object",
      "address"
    ]
  }
}
Specification
JSON schema
{
  "type": "object",
  "properties": {
    "country": {
      "type": "string",
      "enum": [
        "United States"
      ]
    },
    "state": {
      "type": "array",
      "minItems": 1,
      "items": {
        "type": "string"
      }
    },
    "city": {
      "type": "array",
      "minItems": 1,
      "items": {
        "type": "string"
      }
    },
    "zipCode": {
      "type": "array",
      "minItems": 1,
      "items": {
        "type": "string"
      }
    },
    "format": {
      "oneOf": [
        {
          "type": "string",
          "enum": [
            "address",
            "coordinates",
            "object"
          ]
        },
        {
          "type": "array",
          "items": {
            "type": "string",
            "enum": [
              "address",
              "coordinates",
              "object"
            ]
          },
          "minItems": 1
        }
      ]
    }
  },
  "required": [
    "country"
  ]
}