DynamoDB: Scan vs Query using Python

Question

I have a table in DynamoDB with the following key attributes:

clientId : Primary partition Key
timeId : Sort Key

clientId differentiates the records of different clients, and timeId is an epoch timestamp linked to a specific clientId. Example rows from the table look like this:

clientId             timeId              Bucket         dateColn
0000000028037c08     1544282940.0495     MyAWSBucket    1544282940
0000000028037c08     1544283640.119842   MyAWSBucket    1544283640

I am using the following code to fetch the records:

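import argparse

import boto3
from boto3.dynamodb.conditions import Key

ap = argparse.ArgumentParser()
ap.add_argument("-c","--clientId",required=True,help="name of the client")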
ap.add_argument("-st","--startDate",required=True,help="start date to filter")
ap.add_argument("-et","--endDate",required=True,help="end date to filter")
args = vars(ap.parse_args())

dynamodb = boto3.resource('dynamodb', region_name='us-west-1')

table = dynamodb.Table('MyAwsBucket-index')

response = table.query(
    KeyConditionExpression=Key('clientId').eq(args["clientId"]) and Key('timeId').between(args['startDate'], args['endDate'])
)

Essentially, I am trying to filter the table first by clientId and then by a time range defined by two timestamps: a start time and an end time. I can fetch all the records for a client, without the time range, using the following:

KeyConditionExpression=Key('clientId').eq(args["clientId"])

However, when I include the start and end times, I get the following error:

botocore.exceptions.ClientError: An error occurred (ValidationException) when calling the Query operation: Query condition missed key schema element: clientId

How do I resolve this and use both clientId and the start and end times? I read that I could use scan, but I also read somewhere that scan does not fetch records quickly. Since my table has millions of rows, I am not sure if I should use scan. Can someone help?

Also, my start time and end time search inputs are integers, like the values in dateColn, whereas the values in timeId are floats. I am not sure if that is causing any errors.

Simrandeep Singh · Accepted Answer · 2018-12-10 17:45:37Z

2

I read that I could use scan, but I also read somewhere that scan does not fetch records quickly. Since my table has millions of rows, I am not sure if I should use scan.

A DynamoDB scan is a very expensive operation because it reads every item in the table, consuming a lot of the provisioned throughput. Avoid scan wherever a query on the key attributes can do the job.
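To make the contrast concrete, here is a minimal sketch of reading everything a Query matches. A Query only touches items under one partition key, but a single call returns at most 1 MB of data, so a large result set arrives in pages linked by LastEvaluatedKey. The helper name query_all_items is hypothetical, and table is assumed to be a boto3 Table resource (or anything with a compatible query method).

```python
def query_all_items(table, key_condition):
    """Collect every item matching key_condition, following pagination.

    `table` is assumed to behave like a boto3 Table resource.
    `query_all_items` is an illustrative helper, not part of boto3.
    """
    items = []
    kwargs = {"KeyConditionExpression": key_condition}
    while True:
        response = table.query(**kwargs)
        items.extend(response.get("Items", []))
        # DynamoDB sets LastEvaluatedKey only when more pages remain.
        last_key = response.get("LastEvaluatedKey")
        if last_key is None:
            return items
        kwargs["ExclusiveStartKey"] = last_key
```

Holding all items in a list is fine for a bounded time range. For very large ranges, yielding items page by page would keep memory use flat.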

botocore.exceptions.ClientError: An error occurred (ValidationException) when calling the Query operation: Query condition missed key schema element: clientId

This error means that DynamoDB did not receive a condition on the partition key clientId. That can be confusing, because the value you pass may well be non-empty. One possible cause is a type mismatch: the partition key may be expecting a number while args["clientId"] is a string, which is not accepted. Please refer to this documentation for how to specify the intended data type of the arguments.
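On the data type point, here is a small sketch of converting the command-line values before querying. It assumes timeId is stored as a DynamoDB Number, which the question does not confirm. argparse returns strings, and the boto3 resource layer expects numbers as int or Decimal (it rejects Python floats). The helper name to_dynamo_number is hypothetical.

```python
from decimal import Decimal


def to_dynamo_number(text):
    """Convert a command-line string into a Decimal for a Number attribute.

    Illustrative helper; assumes the target attribute is a DynamoDB Number.
    """
    return Decimal(str(text))


# Whole-second bounds, like the values in dateColn.
start = to_dynamo_number("1544282940")
end = to_dynamo_number("1544283641")

# A stored timeId with a fractional part still falls inside the range,
# because the comparison is numeric.
stored_time_id = Decimal("1544282940.0495")
print(start <= stored_time_id <= end)  # True
```

If timeId were stored as a String instead, the bounds would have to stay strings and the comparison would be lexicographic, so it is worth checking the attribute type in the table definition first.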

1 Comment

Apricot Over a year ago

Thank you for your reply. I will go through the documentation and come back

Matt · Accepted Answer · 2019-02-19 22:05:44Z

1

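An obvious issue with your query is that you are using and instead of &. Python's and does not combine the two conditions: because the first condition object is truthy, the expression evaluates to the second operand alone. Only the timeId condition reaches DynamoDB, the clientId condition is dropped, and that is why the error says the key schema element clientId is missing.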

1 Comment

Apricot Over a year ago

Thank you. Since I could not get it working and had tight deadlines, I moved to a Lambda function posting to Elasticsearch. I will try this one next time.
