Auditing AWS with CloudTrail and Analyzing Logs Using Athena

When you work with AWS at scale, visibility and auditing become as important as compute and storage. Knowing who did what, when, and from where is critical for security, compliance, and troubleshooting.
In this blog, I’ll walk through:
What CloudTrail is and how it helps with auditing
How CloudTrail logs are stored in Amazon S3
How Athena helps query those logs without managing servers
Creating tables and partitions in Athena for cost-efficient querying
This setup is commonly used in real-world AWS environments.
Why CloudTrail + Athena?
CloudTrail records AWS API activity across your account (management events by default, data events if you enable them)
S3 stores logs durably and cheaply
Athena lets you query logs directly from S3 using SQL
No servers, no databases to manage, pay only for what you query
Understanding CloudTrail (Auditing in AWS)
CloudTrail is an auditing service provided by Amazon Web Services that records API calls and account activity across your AWS account.
What CloudTrail Records
CloudTrail captures details such as:
Identity of the API caller (IAM user / role)
Time of the API call
Source IP address
AWS service and action performed
Request parameters
Response elements returned by AWS
This makes CloudTrail extremely useful for:
Security investigations
Compliance audits
Debugging accidental changes
Tracking unauthorized access
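To make the list of recorded details concrete, here is a minimal sketch that pulls the who / what / when / where fields out of a single event. The record is a trimmed, made-up example; the field names follow the CloudTrail record format, while the values and the summarize helper are purely illustrative.

```python
import json

# A trimmed, invented CloudTrail record (field names are real, values are made up)
SAMPLE_RECORD = json.loads("""
{
  "eventTime": "2024-05-01T12:34:56Z",
  "eventSource": "ec2.amazonaws.com",
  "eventName": "StopInstances",
  "awsRegion": "us-east-1",
  "sourceIPAddress": "203.0.113.10",
  "userIdentity": {
    "type": "IAMUser",
    "arn": "arn:aws:iam::123456789012:user/alice"
  },
  "requestParameters": {"instancesSet": {"items": [{"instanceId": "i-0123456789abcdef0"}]}},
  "responseElements": null
}
""")


def summarize(record):
    """Return the caller, action, time, and source IP of one record."""
    identity = record.get("userIdentity", {})
    return {
        "who": identity.get("arn", identity.get("type", "unknown")),
        "what": f'{record["eventSource"]}:{record["eventName"]}',
        "when": record["eventTime"],
        "where": record.get("sourceIPAddress", "unknown"),
    }


summary = summarize(SAMPLE_RECORD)
print(summary)
```

During an investigation, the same four fields are usually the first things you filter on.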
Default CloudTrail Behavior
By default, CloudTrail Event history keeps the last 90 days of management events
You can download these events as JSON or CSV
For longer retention, a trail can deliver events to:
Amazon S3
CloudWatch Logs (optional)
Creating a CloudTrail Trail
A Trail enables CloudTrail to continuously deliver log files to your S3 bucket.
Key Points About Trails
A trail delivers logs to a specified S3 bucket
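By default, a trail created in the console applies to all Regions
A multi-Region trail records events from every Region in the same AWS partition; a trail cannot span partitions
Log files are automatically organized in the bucket by account ID, Region, and date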
Once the trail is created, CloudTrail will start pushing logs to S3 without any manual intervention.
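As a rough illustration of where those files land, the sketch below builds the S3 key prefix for one day of logs. It assumes a single-account trail, where keys look like AWSLogs/<account-id>/CloudTrail/<region>/<yyyy>/<mm>/<dd>/; the cloudtrail_prefix helper, the account ID, and the optional bucket prefix are hypothetical.

```python
from datetime import date


def cloudtrail_prefix(account_id, region, day, bucket_prefix=""):
    """Build the key prefix under which one day's log files are expected."""
    parts = ["AWSLogs", account_id, "CloudTrail", region, f"{day:%Y/%m/%d}"]
    if bucket_prefix:
        # An optional prefix configured on the trail comes first
        parts.insert(0, bucket_prefix.strip("/"))
    return "/".join(parts) + "/"


prefix = cloudtrail_prefix("123456789012", "us-east-1", date(2024, 5, 1))
print(prefix)
```

Knowing this layout matters later, because the folder structure decides how the data can be partitioned in Athena.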
Managing Access and Notifications
IAM Permissions
Using AWS IAM, you can control:
Who can create, modify, or delete CloudTrail trails
Who can start or stop logging
Who can access CloudTrail log buckets in S3
This ensures least-privilege access to sensitive audit data.
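One way to express these controls is an IAM policy. The sketch below builds one as a Python dict and prints it as JSON. The action names are real CloudTrail and S3 actions, but the statement IDs and the bucket name example-cloudtrail-logs are assumptions for illustration, and a real policy should be reviewed against your own account setup.

```python
import json

# Hypothetical policy: block trail tampering, allow read-only access to the log bucket
policy = {
    "Version": "2012-10-17",
    "Statement": [
        {
            "Sid": "DenyTrailTampering",
            "Effect": "Deny",
            "Action": [
                "cloudtrail:DeleteTrail",
                "cloudtrail:StopLogging",
                "cloudtrail:UpdateTrail",
            ],
            "Resource": "*",
        },
        {
            "Sid": "ReadOnlyLogBucket",
            "Effect": "Allow",
            "Action": ["s3:GetObject", "s3:ListBucket"],
            # Assumed bucket name; replace with your own log bucket
            "Resource": [
                "arn:aws:s3:::example-cloudtrail-logs",
                "arn:aws:s3:::example-cloudtrail-logs/*",
            ],
        },
    ],
}

policy_json = json.dumps(policy, indent=2)
print(policy_json)
```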
SNS Notifications (Optional)
You can also:
Create an SNS topic
Subscribe to it
Receive notifications when new log files are delivered to S3
This is useful for security or compliance alerting.
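If a subscriber is a script rather than an email inbox, it has to parse the notification. As far as I understand the format, the message body is a small JSON document with an s3Bucket field and a list of s3ObjectKey values; the sketch below assumes exactly that shape, and the sample message and the delivered_objects helper are made up.

```python
import json


def delivered_objects(message_body):
    """Turn one CloudTrail delivery notification into a list of S3 URIs."""
    payload = json.loads(message_body)
    bucket = payload["s3Bucket"]
    return [f"s3://{bucket}/{key}" for key in payload["s3ObjectKey"]]


# Invented example message, shaped like the assumed notification format
sample_message = json.dumps({
    "s3Bucket": "example-cloudtrail-logs",
    "s3ObjectKey": [
        "AWSLogs/123456789012/CloudTrail/us-east-1/2024/05/01/example.json.gz"
    ],
})

uris = delivered_objects(sample_message)
print(uris)
```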
Querying CloudTrail Logs with Athena
Once logs are in S3, querying them manually is difficult. This is where Athena shines.
What is Athena?
Athena is a serverless, interactive query service that allows you to run SQL queries directly on data stored in S3.
Why Use Athena?
No database setup required
Query data directly from S3
Pay per query, based on data scanned
Ideal for ad-hoc analysis
💰 Athena charges around $5 per TB scanned, so optimization matters.
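A quick back-of-the-envelope calculation shows why the amount of data scanned matters. This sketch uses the approximate $5 per TB figure above and assumes a binary terabyte and a 10 MB minimum charge per query; check the pricing page for your Region before relying on the numbers.

```python
PRICE_PER_TB_USD = 5.0                 # approximate figure; varies by Region
BYTES_PER_TB = 1024 ** 4               # assumption: binary terabyte
MIN_BYTES_PER_QUERY = 10 * 1024 ** 2   # assumption: 10 MB minimum per query


def estimate_cost(bytes_scanned):
    """Estimate the cost in USD of a query that scans the given number of bytes."""
    billable = max(bytes_scanned, MIN_BYTES_PER_QUERY)
    return billable / BYTES_PER_TB * PRICE_PER_TB_USD


full_scan = estimate_cost(200 * 1024 ** 3)     # scanning 200 GB of logs
one_partition = estimate_cost(2 * 1024 ** 3)   # scanning a single 2 GB slice
print(f"full scan: ${full_scan:.4f}, one partition: ${one_partition:.4f}")
```

The same question costs about a hundred times less when it only has to read the slice it needs.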
Athena Tables: How They Work
Athena tables defined over raw S3 data, such as CloudTrail logs, are external tables
Table schema and actual data are loosely coupled
Dropping a table does not delete data from S3
You can recreate tables anytime without data loss
This makes Athena safe and flexible for log analysis.
Creating a Table in Athena (Using Console)
Before creating a table, make sure:
Log data already exists in S3
You’ve copied the S3 path of the folder that contains the data (Athena expects a folder location, not a single object)
Steps to Create Table in Athena
Navigate to Athena Console
Click Get Started
Choose Create table
Select Create table from S3 bucket data
Choose an existing database or create a new one
Paste the S3 path of the folder that holds your data
Select file format (CSV / JSON)
Define column names and data types
Click Create table
Start querying
Athena shows the amount of data scanned for every query, so keep an eye on it.
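Once the table exists, a typical first question is "who called this API?". The sketch below builds such a query as a string you could paste into the Athena query editor. The table name cloudtrail_logs and the column names (eventtime, eventname, sourceipaddress, useridentity.arn) are assumptions that match a table created from the CloudTrail DDL in the AWS documentation; adjust them to the columns you actually defined.

```python
import re

# Assumed column names; they depend on how your table was defined
QUERY_TEMPLATE = (
    "SELECT eventtime, useridentity.arn AS caller, sourceipaddress\n"
    "FROM {table}\n"
    "WHERE eventname = '{event_name}'\n"
    "ORDER BY eventtime DESC\n"
    "LIMIT {limit}"
)


def who_called(table, event_name, limit=20):
    """Build a query listing the most recent callers of one API action."""
    # Accept only simple identifiers so the inputs cannot change the SQL
    for value in (table, event_name):
        if not re.fullmatch(r"[A-Za-z_][A-Za-z0-9_]*", value):
            raise ValueError(f"unexpected characters in {value!r}")
    return QUERY_TEMPLATE.format(table=table, event_name=event_name, limit=int(limit))


sql = who_called("cloudtrail_logs", "DeleteBucket")
print(sql)
```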
Cost Optimization in Athena
There are two major ways to reduce Athena query costs:
1. Use Compressed Columnar Formats
Prefer Parquet or ORC
These formats reduce:
Data scanned
Query execution time
Cost
2. Use Partitions (Highly Recommended)
Partitioning helps Athena scan only relevant data, instead of the entire dataset.
Creating Partitions in Athena
Prerequisite
Your S3 folder structure must encode the partition column as key=value folders (Hive-style), so Athena can map each folder to a partition.
Example:
s3://cloudtrail-logs/year=2024/
s3://cloudtrail-logs/year=2025/
Partitioning Strategy
Take similar datasets
Divide them based on year (or month/day)
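To see why this layout saves money, here is a toy model of partition pruning. It is not how Athena is implemented; it only illustrates that a filter on the partition column lets whole folders be skipped. The object keys and helper names are hypothetical.

```python
# Hypothetical object keys laid out in key=value folders
KEYS = [
    "year=2024/part-0001.json.gz",
    "year=2024/part-0002.json.gz",
    "year=2025/part-0001.json.gz",
]


def partition_of(key):
    """Parse key=value path segments, e.g. {'year': '2024'}."""
    return dict(seg.split("=", 1) for seg in key.split("/") if "=" in seg)


def keys_to_scan(keys, **wanted):
    """Keep only the objects whose partition values match the filter."""
    return [
        key for key in keys
        if all(partition_of(key).get(col) == val for col, val in wanted.items())
    ]


scanned = keys_to_scan(KEYS, year="2025")
print(scanned)
```

With a filter on year, two of the three objects are never read, which is exactly the data you do not pay for.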
Steps to Create Partitioned Table
Create folders in S3 like year=2024, year=2025
Go to Athena → Create table from S3
Provide S3 location
Select file format (CSV / JSON)
Define all columns except partition column
Click Add partition
Add year as partition column
Click Create table
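The console steps above roughly correspond to a CREATE EXTERNAL TABLE statement. The sketch below generates one for illustration. The table name, the three columns, and the choice of the OpenX JSON SerDe are my assumptions, and they presume line-delimited JSON; raw CloudTrail files wrap events in a Records array, so for those the CloudTrail-specific DDL from the AWS documentation is the safer starting point.

```python
def create_table_ddl(table, columns, partition_columns, location):
    """Build a partitioned external table definition for JSON data in S3."""
    cols = ",\n  ".join(f"{name} {typ}" for name, typ in columns)
    parts = ", ".join(f"{name} {typ}" for name, typ in partition_columns)
    return (
        f"CREATE EXTERNAL TABLE IF NOT EXISTS {table} (\n  {cols}\n)\n"
        # Partition columns are declared here, not in the column list
        f"PARTITIONED BY ({parts})\n"
        "ROW FORMAT SERDE 'org.openx.data.jsonserde.JsonSerDe'\n"
        f"LOCATION '{location}'"
    )


ddl = create_table_ddl(
    "cloudtrail_logs",                      # hypothetical table name
    [("eventtime", "string"), ("eventname", "string"), ("sourceipaddress", "string")],
    [("year", "string")],
    "s3://cloudtrail-logs/",
)
print(ddl)
```

Note that year appears only in PARTITIONED BY, which mirrors the "define all columns except partition column" step.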
Loading Partitions
For partitioned tables, queries won’t return results until partitions are loaded.
Two Ways to Load Partitions
Click table name → Load partitions
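Use a SQL query: MSCK REPAIR TABLE your_table_name (this discovers partitions when the S3 folders use the key=value layout)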
Once partitions are loaded, you can start querying efficiently.
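For reference, here is a small sketch of the two kinds of statements commonly used to load partitions: MSCK REPAIR TABLE, which scans for key=value folders, and ALTER TABLE ADD PARTITION, which registers one folder explicitly. The table name and bucket are hypothetical, and the helpers only build the SQL text; you would still run it in the Athena query editor.

```python
def repair_statement(table):
    """Statement that asks Athena to discover all key=value partition folders."""
    return f"MSCK REPAIR TABLE {table}"


def add_partition_statement(table, location_root, **partition):
    """Statement that registers a single partition and its S3 folder."""
    spec = ", ".join(f"{col} = '{val}'" for col, val in partition.items())
    path = "/".join(f"{col}={val}" for col, val in partition.items())
    root = location_root.rstrip("/")
    return (
        f"ALTER TABLE {table} ADD IF NOT EXISTS PARTITION ({spec}) "
        f"LOCATION '{root}/{path}/'"
    )


repair_sql = repair_statement("cloudtrail_logs")
add_sql = add_partition_statement("cloudtrail_logs", "s3://cloudtrail-logs/", year="2024")
print(repair_sql)
print(add_sql)
```

The explicit form is handy when folders do not follow the key=value convention, since you can point each partition at any location.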

Final Thoughts
Using CloudTrail + Athena together gives you:
Complete AWS activity visibility
Serverless log analysis
Scalable and cost-effective auditing
SQL-based troubleshooting without databases
This setup is widely used in production AWS environments, especially for:
Security teams
Compliance audits
Incident investigations
If you’re learning AWS seriously, mastering this flow is a big step forward 🚀

