Gain insights from historical location data using Amazon Location Service and AWS analytics services


Many organizations around the world rely on the use of physical assets, such as vehicles, to deliver a service to their end customers. By tracking these assets in real time and storing the results, asset owners can derive valuable insights on how their assets are being used to continuously deliver business improvements and plan for future changes. For example, a delivery company operating a fleet of vehicles may need to ascertain the impact from local policy changes outside of their control, such as the announced expansion of an Ultra-Low Emission Zone (ULEZ). By combining historical vehicle location data with information from other sources, the company can devise empirical approaches for better decision-making. For example, the company’s procurement team can use this information to make decisions about which vehicles to prioritize for replacement before policy changes go into effect.

Developers can use the support in Amazon Location Service for publishing device position updates to Amazon EventBridge to build a near real-time data pipeline that stores locations of tracked assets in Amazon Simple Storage Service (Amazon S3). Additionally, you can use AWS Lambda to enrich incoming location data with data from other sources, such as an Amazon DynamoDB table containing vehicle maintenance details. Then a data analyst can use the geospatial querying capabilities of Amazon Athena to gain insights, such as the number of days their vehicles have operated within the proposed boundaries of an expanded ULEZ. Because vehicles that don’t meet ULEZ emissions standards are subject to a daily charge to operate within the zone, you can use the location data, together with maintenance data such as the age of the vehicle, current mileage, and current emissions standards, to estimate the amount the company would need to spend on daily fees.

This post shows how you can use Amazon Location, EventBridge, Lambda, Amazon Data Firehose, and Amazon S3 to build a location-aware data pipeline, and use this data to drive meaningful insights using AWS Glue and Athena.

Overview of solution

This is a fully serverless solution for location-based asset management. The solution consists of the following interfaces:

  • IoT or mobile application – A mobile application or an Internet of Things (IoT) device allows the tracking of a company vehicle while it is in use and transmits its current location securely to the data ingestion layer in AWS. The ingestion approach is not in scope of this post. Instead, a Lambda function in our solution simulates sample vehicle journeys and directly updates Amazon Location tracker objects with randomized locations.
  • Data analytics – Business analysts gather operational insights from multiple data sources, including the location data collected from the vehicles. Data analysts are looking for answers to questions such as, “How long did a given vehicle historically spend inside a proposed zone, and how much would the fees have cost had the policy been in place over the past 12 months?”

The following diagram illustrates the solution architecture.
Architecture diagram

The workflow consists of the following key steps:

  1. The tracking functionality of Amazon Location is used to track the vehicle. Using EventBridge integration, filtered positional updates are published to an EventBridge event bus. This solution uses distance-based filtering to reduce costs and jitter. Distance-based filtering ignores location updates in which devices have moved less than 30 meters (98.4 feet).
  2. Amazon Location device position events arrive on the EventBridge default bus with source: ["aws.geo"] and detail-type: ["Location Device Position Event"]. One rule is created to forward these events to two downstream targets: a Lambda function, and a Firehose delivery stream.
  3. Two different patterns, based on each target, are described in this post to demonstrate different approaches to committing the data to an S3 bucket:
    1. Lambda function – The first approach uses a Lambda function to demonstrate how you can use code in the data pipeline to directly transform the incoming location data. You can modify the Lambda function to fetch additional vehicle information from a separate data store (for example, a DynamoDB table or a Customer Relationship Management system) to enrich the data, before storing the results in an S3 bucket. In this model, the Lambda function is invoked for each incoming event.
    2. Firehose delivery stream – The second approach uses a Firehose delivery stream to buffer and batch the incoming positional updates, before storing them in an S3 bucket without modification. This method uses GZIP compression to optimize storage consumption and query performance. You can also use the data transformation feature of Data Firehose to invoke a Lambda function that transforms records in batches, as illustrated in the sketch after this list.
  4. AWS Glue crawls both S3 bucket paths, populates the AWS Glue database tables based on the inferred schemas, and makes the data available to other analytics applications through the AWS Glue Data Catalog.
  5. Athena is used to run geospatial queries on the location data stored in the S3 buckets. The Data Catalog provides metadata that allows analytics applications using Athena to find, read, and process the location data stored in Amazon S3.
  6. The solution includes a Lambda function that continuously updates the Amazon Location tracker with simulated location data from fictitious journeys. The Lambda function is triggered at regular intervals using a scheduled EventBridge rule.
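To illustrate the batch transformation option mentioned in step 3, the following is a minimal sketch of a Data Firehose transformation Lambda function. It assumes each record’s payload is the flattened position JSON described later in this post; depending on your configuration, records may instead carry the full EventBridge envelope.

import base64
import json

def lambda_handler(event, context):
    # Data Firehose invokes the function with a batch of records; each
    # record's payload arrives base64-encoded.
    output = []
    for record in event["records"]:
        payload = json.loads(base64.b64decode(record["data"]))
        # Illustrative transformation: split the position array into
        # separate longitude and latitude keys.
        longitude, latitude = payload.pop("Position")
        payload["Longitude"] = longitude
        payload["Latitude"] = latitude
        data = (json.dumps(payload, separators=(",", ":")) + "\n").encode("utf-8")
        output.append({
            "recordId": record["recordId"],
            "result": "Ok",
            "data": base64.b64encode(data).decode("utf-8"),
        })
    return {"records": output}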

You can test this solution yourself using the AWS Samples GitHub repository. The repository contains the AWS Serverless Application Model (AWS SAM) template and Lambda code required to try out this solution. Refer to the instructions in the README file for steps on how to provision and decommission this solution.

Visual layouts in some screenshots in this post may look different than those in your AWS Management Console.

Data generation

In this section, we discuss the steps to manually or automatically generate journey data.

Manually generate journey data

You can manually update device positions using the AWS Command Line Interface (AWS CLI) command aws location batch-update-device-position. Replace the tracker-name, device-id, Position, and SampleTime values with your own, and make sure that successive updates are more than 30 meters apart to place an event on the default EventBridge event bus:

aws location batch-update-device-position --tracker-name <tracker-name> \
    --updates '[{"DeviceId": "<device-id>", "Position": [<longitude>, <latitude>], "SampleTime": "<YYYY-MM-DDThh:mm:ssZ>"}]'

Automatically generate journey data using the simulator

The provided AWS CloudFormation template deploys an EventBridge scheduled rule and an accompanying Lambda function that simulates tracker updates from vehicles. This rule is enabled by default, and runs at a frequency specified by the SimulationIntervalMinutes CloudFormation parameter. The data generation Lambda function updates the Amazon Location tracker with a randomized position offset from each vehicle’s base location.

Vehicle names and base locations are stored in the vehicles.json file. A vehicle’s starting position is reset each day, and base locations have been chosen to give vehicles the ability to drift in and out of the ULEZ on a given day, providing a realistic journey simulation.
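The core logic of the simulation function resembles the following sketch. This is a simplified illustration: the base locations are hard-coded here rather than read from vehicles.json, and the tracker name is passed in by the caller.

import random
from datetime import datetime, timezone

import boto3

location = boto3.client("location")

# Illustrative base locations; the deployed function reads these from vehicles.json.
VEHICLES = {
    "vehicle1": (-0.1278, 51.5074),
    "vehicle2": (-0.0877, 51.5081),
}

def simulate_positions(tracker_name):
    # Apply a small random offset to each vehicle's base location and send
    # the result to the Amazon Location tracker.
    updates = []
    for device_id, (longitude, latitude) in VEHICLES.items():
        updates.append({
            "DeviceId": device_id,
            "Position": [longitude + random.uniform(-0.05, 0.05),
                         latitude + random.uniform(-0.05, 0.05)],
            "SampleTime": datetime.now(timezone.utc),
        })
    location.batch_update_device_position(TrackerName=tracker_name, Updates=updates)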

You can disable the rule temporarily by navigating to the scheduled rule details on the EventBridge console. Alternatively, change the parameter State: ENABLED to State: DISABLED for the scheduled rule resource GenerateDevicePositionsScheduleRule in the template.yml file. Rebuild and re-deploy the AWS SAM template for this change to take effect.

Location data pipeline approaches

The configurations outlined in this section are deployed automatically by the provided AWS SAM template. The information in this section is provided to describe the pertinent parts of the solution.

Amazon Location device position events

Amazon Location sends device position update events to EventBridge in the following format:

{
    "version":"0",
    "id":"<event-id>",
    "detail-type":"Location Device Position Event",
    "source":"aws.geo",
    "account":"<account-number>",
    "time":"<YYYY-MM-DDThh:mm:ssZ>",
    "region":"<region>",
    "resources":[
        "arn:aws:geo:<region>:<account-number>:tracker/<tracker-name>"
    ],
    "detail":{
        "EventType":"UPDATE",
        "TrackerName":"<tracker-name>",
        "DeviceId":"<device-id>",
        "SampleTime":"<YYYY-MM-DDThh:mm:ssZ>",
        "ReceivedTime":"<YYYY-MM-DDThh:mm:ss.sssZ>",
        "Position":[
            <longitude>,
            <latitude>
        ]
    }
}

You can optionally specify an input transformation to modify the format and contents of the device position event data before it reaches the target.

Data enrichment using Lambda

Data enrichment in this pattern is facilitated through the invocation of a Lambda function. In this example, we call this function ProcessDevicePosition, and use a Python runtime. A custom transformation is applied in the EventBridge target definition to receive the event data in the following format:

{
    "EventType":<EventType>,
    "TrackerName":<TrackerName>,
    "DeviceId":<DeviceId>,
    "SampleTime":<SampleTime>,
    "ReceivedTime":<ReceivedTime>,
    "Position":[<Longitude>,<Latitude>]
}

You could apply additional transformations, such as refactoring the Latitude and Longitude data into separate key-value pairs, if this is required by the downstream business logic processing the events.

The following code demonstrates the Python application logic that is run by the ProcessDevicePosition Lambda function. Error handling has been omitted from this code snippet for brevity. The full code is available in the GitHub repo.

import json
import os
import uuid
import boto3

# Read environment variables from the Lambda function configuration.
bucket_name = os.environ["S3_BUCKET_NAME"]
bucket_prefix = os.environ["S3_BUCKET_LAMBDA_PREFIX"]

s3 = boto3.client("s3")

def lambda_handler(event, context):
    # Write one object per position event, prefixed by device ID and keyed
    # by sample time plus a random suffix to avoid collisions.
    key = "%s/%s/%s-%s.json" % (bucket_prefix,
                                event["DeviceId"],
                                event["SampleTime"],
                                str(uuid.uuid4()))
    body = json.dumps(event, separators=(",", ":"))
    body_encoded = body.encode("utf-8")
    s3.put_object(Bucket=bucket_name, Key=key, Body=body_encoded)
    return {
        "statusCode": 200,
        "body": "success"
    }

The preceding code creates an S3 object for each device position event received by EventBridge. The code uses the DeviceId as a prefix to write the objects to the bucket.

You can add additional logic to the preceding Lambda function code to enrich the event data using other sources. The example in the GitHub repo demonstrates enriching the event with data from a DynamoDB vehicle maintenance table.
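As an illustration, the enrichment step could resemble the following sketch. The environment variable name and the DeviceId key schema are assumptions made for this example; refer to the GitHub repo for the actual implementation.

import os

import boto3

dynamodb = boto3.resource("dynamodb")
# Hypothetical environment variable holding the maintenance table name.
maintenance_table = dynamodb.Table(os.environ["VEHICLE_MAINTENANCE_TABLE"])

def enrich_event(event):
    # Look up maintenance details for the reporting device and merge them
    # into the event before it is written to Amazon S3. Note that numeric
    # attributes are returned as Decimal and may need conversion before
    # JSON serialization.
    response = maintenance_table.get_item(Key={"DeviceId": event["DeviceId"]})
    event.update(response.get("Item", {}))
    return event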

In addition to the prerequisite AWS Identity and Access Management (IAM) permissions provided by the AWSLambdaBasicExecutionRole managed policy, the ProcessDevicePosition function requires permissions to perform the S3 put_object action and any other actions required by the data enrichment logic. The IAM permissions required by the solution are documented in the template.yml file.

{
    "Version":"2012-10-17",
    "Statement":[
        {
            "Action":[
                "s3:ListBucket"
            ],
            "Resource":[
                "arn:aws:s3:::<S3_BUCKET_NAME>"
            ],
            "Effect":"Allow"
        },
        {
            "Action":[
                "s3:PutObject"
            ],
            "Resource":[
                "arn:aws:s3:::<S3_BUCKET_NAME>/<S3_BUCKET_LAMBDA_PREFIX>/*"
            ],
            "Effect":"Allow"
        }
    ]
}

Data pipeline using Amazon Data Firehose

Complete the following steps to create your Firehose delivery stream:

  1. On the Amazon Data Firehose console, choose Firehose streams in the navigation pane.
  2. Choose Create Firehose stream.
  3. For Source, choose Direct PUT.
  4. For Destination, choose Amazon S3.
  5. For Firehose stream name, enter a name (for this post, ProcessDevicePositionFirehose).
    Create Firehose stream
  6. Configure the destination settings with details about the S3 bucket in which the location data is stored, along with the partitioning strategy:
    1. Use <S3_BUCKET_NAME> and <S3_BUCKET_FIREHOSE_PREFIX> to determine the bucket and object prefixes.
    2. Use DeviceId as an additional prefix to write the objects to the bucket.
  7. Enable Dynamic partitioning and New line delimiter to make sure partitioning is automatic based on DeviceId, and that new line delimiters are added between records in objects that are delivered to Amazon S3.

Both settings are required by AWS Glue to later crawl the data, and for Athena to recognize individual records.
Destination settings for Firehose stream
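If you prefer to create the stream programmatically, the following boto3 sketch approximates the console configuration above. The role ARN and error prefix are placeholders, and the JQ expression mirrors the DeviceId partitioning strategy; treat this as a starting point under those assumptions rather than a definitive configuration.

import boto3

firehose = boto3.client("firehose")

firehose.create_delivery_stream(
    DeliveryStreamName="ProcessDevicePositionFirehose",
    DeliveryStreamType="DirectPut",
    ExtendedS3DestinationConfiguration={
        "RoleARN": "<firehose-role-arn>",
        "BucketARN": "arn:aws:s3:::<S3_BUCKET_NAME>",
        # Partition objects by the DeviceId value extracted from each record.
        "Prefix": "<S3_BUCKET_FIREHOSE_PREFIX>/!{partitionKeyFromQuery:deviceid}/",
        "ErrorOutputPrefix": "<S3_BUCKET_FIREHOSE_PREFIX>-errors/",
        "CompressionFormat": "GZIP",
        "DynamicPartitioningConfiguration": {"Enabled": True},
        "ProcessingConfiguration": {
            "Enabled": True,
            "Processors": [
                {
                    # Extract the partition key with a JQ query.
                    "Type": "MetadataExtraction",
                    "Parameters": [
                        {"ParameterName": "MetadataExtractionQuery",
                         "ParameterValue": "{deviceid: .DeviceId}"},
                        {"ParameterName": "JsonParsingEngine",
                         "ParameterValue": "JQ-1.6"},
                    ],
                },
                # Add the new line delimiter between records.
                {"Type": "AppendDelimiterToRecord"},
            ],
        },
    },
)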

Create an EventBridge rule and attach targets

The EventBridge rule ProcessDevicePosition defines two targets: the ProcessDevicePosition Lambda function and the ProcessDevicePositionFirehose delivery stream. Complete the following steps to create the rule and attach the targets:

  1. On the EventBridge console, create a new rule.
  2. For Name, enter a name (for this post, ProcessDevicePosition).
  3. For Event bus, choose default.
  4. For Rule type, select Rule with an event pattern.
    EventBridge rule detail
  5. For Event source, select AWS events or EventBridge partner events.
    EventBridge event source
  6. For Method, select Use pattern form.
  7. In the Event pattern section, specify AWS services as the source, Amazon Location Service as the specific service, and Location Device Position Event as the event type.
    EventBridge creation method
  8. For Target 1, attach the ProcessDevicePosition Lambda function as a target.
    EventBridge target 1
  9. We use Input transformer to customize the event that is committed to the S3 bucket.
    EventBridge target 1 transformer
  10. Configure Input paths map and Input template to organize the payload into the desired format.
    1. The following code is the input paths map:
      {
          "EventType": "$.detail.EventType",
          "TrackerName": "$.detail.TrackerName",
          "DeviceId": "$.detail.DeviceId",
          "SampleTime": "$.detail.SampleTime",
          "ReceivedTime": "$.detail.ReceivedTime",
          "Longitude": "$.detail.Position[0]",
          "Latitude": "$.detail.Position[1]"
      }

    2. The following code is the input template:
      {
          "EventType":<EventType>,
          "TrackerName":<TrackerName>,
          "DeviceId":<DeviceId>,
          "SampleTime":<SampleTime>,
          "ReceivedTime":<ReceivedTime>,
          "Position":[<Longitude>, <Latitude>]
      }

  11. For Target 2, choose the ProcessDevicePositionFirehose delivery stream as a target.
    EventBridge target 2

This target requires an IAM role that allows one or multiple records to be written to the Firehose delivery stream:

{
    "Version": "2012-10-17",
    "Statement": [
        {
            "Action": [
                "firehose:PutRecord",
                "firehose:PutRecords"
            ],
            "Resource": [
                "arn:aws:firehose:<region>:<account-id>:deliverystream/<delivery-stream-name>"
            ],
            "Effect": "Allow"
        }
    ]
}
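As a programmatic alternative to the console steps above, a boto3 sketch like the following could attach the Lambda target with the same input transformer (the target ID and function ARN are placeholders):

import boto3

events = boto3.client("events")

events.put_targets(
    Rule="ProcessDevicePosition",
    EventBusName="default",
    Targets=[{
        "Id": "process-device-position-lambda",
        "Arn": "<lambda-function-arn>",
        "InputTransformer": {
            # Pull the fields of interest out of the event detail.
            "InputPathsMap": {
                "EventType": "$.detail.EventType",
                "TrackerName": "$.detail.TrackerName",
                "DeviceId": "$.detail.DeviceId",
                "SampleTime": "$.detail.SampleTime",
                "ReceivedTime": "$.detail.ReceivedTime",
                "Longitude": "$.detail.Position[0]",
                "Latitude": "$.detail.Position[1]",
            },
            # Reassemble them into the flattened payload shown earlier.
            "InputTemplate": ('{"EventType":<EventType>,"TrackerName":<TrackerName>,'
                              '"DeviceId":<DeviceId>,"SampleTime":<SampleTime>,'
                              '"ReceivedTime":<ReceivedTime>,'
                              '"Position":[<Longitude>,<Latitude>]}'),
        },
    }],
)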

Crawl and catalog the data using AWS Glue

After sufficient data has been generated, complete the following steps:

  1. On the AWS Glue console, choose Crawlers in the navigation pane.
  2. Select the crawlers that have been created, location-analytics-glue-crawler-lambda and location-analytics-glue-crawler-firehose.
  3. Choose Run.

The crawlers will automatically classify the data as JSON, group the records into tables and partitions, and commit the associated metadata to the AWS Glue Data Catalog.
Crawlers

  1. When the Last run statuses of both crawlers show as Succeeded, confirm that two tables (lambda and firehose) have been created on the Tables page.

The solution partitions the incoming location data based on the deviceid field. Therefore, as long as there are no new devices or schema changes, the crawlers don’t need to run again. However, if new devices are added, or a different field is used for partitioning, the crawlers need to run again.
Tables

You’re now ready to query the tables using Athena.

Query the data using Athena

Athena is a serverless, interactive analytics service built to analyze unstructured, semi-structured, and structured data where it is hosted. If this is your first time using the Athena console, follow the instructions to set up a query result location in Amazon S3. To query the data with Athena, complete the following steps:

  1. On the Athena console, open the query editor.
  2. For Data source, choose AwsDataCatalog.
  3. For Database, choose location-analytics-glue-database.
  4. On the options menu (three vertical dots), choose Preview Table to query the content of both tables.
    Preview table

The query displays 10 sample positional records currently stored in the table. The following screenshot is an example from previewing the firehose table. The firehose table stores raw, unmodified data from the Amazon Location tracker.
Query results
You can now experiment with geospatial queries. The GeoJSON file for the 2021 London ULEZ expansion is part of the repository, and has already been converted into a query compatible with both Athena tables.

  1. Copy and paste the content from the 1-firehose-athena-ulez-2021-create-view.sql file found in the examples/firehose folder into the query editor.

This query uses the ST_Within geospatial function to determine if a recorded position is inside or outside the ULEZ zone defined by the polygon. A new view called ulezvehicleanalysis_firehose is created with a new column, insidezone, which captures whether the recorded position exists within the zone.

A simple Python utility is provided, which converts the polygon features found in the downloaded GeoJSON file into ST_Polygon strings, based on the well-known text format, that can be used directly in an Athena query.
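The utility’s actual code is available in the repository; conceptually, the conversion resembles this sketch, which assumes a feature collection of simple polygons without holes:

import json

def polygon_to_wkt(geojson_path):
    # Convert the first polygon feature in a GeoJSON file into a well-known
    # text string suitable for Athena's ST_Polygon function, for example:
    # POLYGON ((-0.1 51.5, -0.09 51.5, -0.09 51.51, -0.1 51.5))
    with open(geojson_path) as f:
        feature = json.load(f)["features"][0]
    exterior_ring = feature["geometry"]["coordinates"][0]
    points = ", ".join("%s %s" % (lon, lat) for lon, lat in exterior_ring)
    return "POLYGON ((%s))" % points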

  1. Choose Preview View on the ulezvehicleanalysis_firehose view to explore its content.
    Preview view

You can now run queries against this view to gain overarching insights.

  1. Copy and paste the content from the 2-firehose-athena-ulez-2021-query-days-in-zone.sql file found in the examples/firehose folder into the query editor.

This query establishes the total number of days each vehicle has entered the ULEZ, and what the expected total charges would be. The query has been parameterized using the ? placeholder character. Parameterized queries allow you to rerun the same query with different parameter values.
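If you want to run the parameterized query outside the console, Athena’s start_query_execution API accepts execution parameters that bind to the ? placeholder. The following sketch assumes the query file is available locally and uses a placeholder output location:

import boto3

athena = boto3.client("athena")

with open("2-firehose-athena-ulez-2021-query-days-in-zone.sql") as f:
    query = f.read()

response = athena.start_query_execution(
    QueryString=query,
    QueryExecutionContext={"Database": "location-analytics-glue-database"},
    ResultConfiguration={"OutputLocation": "s3://<query-results-bucket>/"},
    # Bound to the ? placeholder: the daily fee amount.
    ExecutionParameters=["12.50"],
)
print(response["QueryExecutionId"])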

  1. Enter the daily fee amount for Parameter 1, then run the query.
    Query editor

The results display each vehicle, the total number of days spent in the proposed ULEZ, and the total charges based on the daily fee you entered.
Query results
You can repeat this exercise using the lambda table. Data in the lambda table is augmented with additional vehicle details present in the vehicle maintenance DynamoDB table at the time it is processed by the Lambda function. The solution supports the following fields:

  • MeetsEmissionStandards (Boolean)
  • Mileage (Number)
  • PurchaseDate (String, in YYYY-MM-DD format)

You can also enrich the new data as it arrives.
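The following steps use the DynamoDB console; alternatively, a record with the same shape could be seeded with a short boto3 sketch such as this one (substitute the table name from the stack output mentioned in step 1):

import boto3

dynamodb = boto3.resource("dynamodb")
# Substitute the VehicleMaintenanceDynamoTable output from the deployed stack.
table = dynamodb.Table("<vehicle-maintenance-table-name>")

# Create a maintenance record using the fields listed above.
table.put_item(Item={
    "DeviceId": "vehicle1",
    "PurchaseDate": "2005-10-01",
    "Mileage": 10000,
    "MeetsEmissionStandards": False,
})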

  1. On the DynamoDB console, find the vehicle maintenance table under Tables. The table name is provided as the output VehicleMaintenanceDynamoTable in the deployed CloudFormation stack.
  2. Choose Explore table items to view the content of the table.
  3. Choose Create item to create a new record for a vehicle.
    Create item
  4. Enter DeviceId (such as vehicle1 as a String), PurchaseDate (such as 2005-10-01 as a String), Mileage (such as 10000 as a Number), and MeetsEmissionStandards (with a value such as False as a Boolean).
  5. Choose Create item to create the record.
    Create item
  6. Duplicate the newly created record with additional entries for other vehicles (such as for vehicle2 or vehicle3), modifying the values of the attributes slightly each time.
  7. Rerun the location-analytics-glue-crawler-lambda AWS Glue crawler after new data has been generated to confirm that the update to the schema with new fields is registered.
  8. Copy and paste the content from the 1-lambda-athena-ulez-2021-create-view.sql file found in the examples/lambda folder into the query editor.
  9. Preview the ulezvehicleanalysis_lambda view to confirm that the new columns have been created.

If errors such as Column 'mileage' cannot be resolved are displayed, the data enrichment is not taking place, or the AWS Glue crawler has not yet detected the updates to the schema.

If the Preview table option is only returning results from before you created records in the DynamoDB table, sort the query results in descending order by sampletime (for example, order by sampletime desc limit 100;).
Query results
Now we focus on the vehicles that don’t currently meet emissions standards, and order the vehicles in descending order based on the mileage per year (calculated using the latest mileage / age of vehicle in years).

  1. Copy and paste the content from the 2-lambda-athena-ulez-2021-query-days-in-zone.sql file found in the examples/lambda folder into the query editor.
    Query results

In this example, we can see that out of our fleet of vehicles, five have been reported as not meeting emissions standards. We can also see the vehicles that have accumulated high mileage per year, and the number of days spent in the proposed ULEZ. The fleet operator may now decide to prioritize these vehicles for replacement. Because location data is enriched with the most up-to-date vehicle maintenance data at the time it is ingested, you can further evolve these queries to run over a defined time window. For example, you could factor in mileage changes within the past year.

Because of the dynamic nature of the data enrichment, any new data being committed to Amazon S3, along with the query results, will be altered as and when records are updated in the DynamoDB vehicle maintenance table.

Clean up

Refer to the instructions in the README file to clean up the resources provisioned for this solution.

Conclusion

This post demonstrated how you can use Amazon Location, EventBridge, Lambda, Amazon Data Firehose, and Amazon S3 to build a location-aware data pipeline, and use the collected device position data to drive analytical insights using AWS Glue and Athena. By tracking these assets in real time and storing the results, companies can derive valuable insights on how effectively their fleets are being utilized and better react to changes in the future. You can now explore extending this sample code with your own device tracking data and analytics requirements.


About the Authors

Alan Peaty is a Senior Partner Solutions Architect at AWS. Alan helps Global Systems Integrators (GSIs) and Global Independent Software Vendors (GISVs) solve complex customer challenges using AWS services. Prior to joining AWS, Alan worked as an architect at systems integrators to translate business requirements into technical solutions. Outside of work, Alan is an IoT enthusiast and a keen runner who loves to hit the muddy trails of the English countryside.

Parag Srivastava is a Solutions Architect at AWS, helping enterprise customers with successful cloud adoption and migration. During his professional career, he has been extensively involved in complex digital transformation projects. He is also passionate about building innovative solutions around geospatial aspects of addresses.
