Andy86
(Andy Haag)
1
I am using the Amazon Textract .NET SDK to extract text from images. I need to extract key-value pairs from the extracted text.
My understanding is that I need to iterate through the list of blocks and check for KEY_VALUE_SET.
Can someone provide me with a piece of code that will give me key-value pairs after text extraction?
My sample code:
var DocRequest = new AnalyzeDocumentRequest()
{
    Document = MyDocument,
    FeatureTypes = new List<string> { Amazon.Textract.FeatureType.FORMS, Amazon.Textract.FeatureType.TABLES }
};
var response = client.AnalyzeDocumentAsync(DocRequest);
Geo90
(Geo Murazik)
2
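Here is sample code that iterates through the blocks and extracts key-value pairs. Textract returns both keys and values as KEY_VALUE_SET blocks; the EntityTypes list says whether a block is a KEY or a VALUE. A KEY block links to its VALUE block through a VALUE relationship, and the actual text sits in the WORD blocks listed under each block's CHILD relationship:
var DocRequest = new AnalyzeDocumentRequest()
{
    Document = MyDocument,
    FeatureTypes = new List<string> { Amazon.Textract.FeatureType.FORMS, Amazon.Textract.FeatureType.TABLES }
};
var response = await client.AnalyzeDocumentAsync(DocRequest);

// Index every block by Id so relationships can be resolved without rescanning the list.
var blocksById = response.Blocks.ToDictionary(b => b.Id);

// Joins the text of a block's CHILD blocks (WORD or SELECTION_ELEMENT).
string GetText(Block block)
{
    var parts = new List<string>();
    foreach (var relationship in block.Relationships ?? new List<Relationship>())
    {
        if (relationship.Type != RelationshipType.CHILD) continue;
        foreach (var childId in relationship.Ids)
        {
            if (!blocksById.TryGetValue(childId, out var child)) continue;
            if (child.BlockType == BlockType.WORD) parts.Add(child.Text);
            // Checkboxes and radio buttons report SELECTED or NOT_SELECTED instead of text.
            else if (child.BlockType == BlockType.SELECTION_ELEMENT) parts.Add(child.SelectionStatus?.Value);
        }
    }
    return string.Join(" ", parts);
}

var keyValuePairs = new List<KeyValuePair<string, string>>();
foreach (var keyBlock in response.Blocks)
{
    // Keys and values are both KEY_VALUE_SET blocks; EntityTypes tells them apart.
    if (keyBlock.BlockType != BlockType.KEY_VALUE_SET
        || keyBlock.EntityTypes == null
        || !keyBlock.EntityTypes.Contains("KEY"))
        continue;

    // A KEY block points to its VALUE block(s) through a VALUE relationship.
    var valueIds = (keyBlock.Relationships ?? new List<Relationship>())
        .Where(r => r.Type == RelationshipType.VALUE)
        .SelectMany(r => r.Ids);
    var valueText = string.Join(" ",
        valueIds.Where(blocksById.ContainsKey).Select(id => GetText(blocksById[id])));

    keyValuePairs.Add(new KeyValuePair<string, string>(GetText(keyBlock), valueText));
}
// Do something with keyValuePairs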
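Note that you need to add the following using directives at the top of your file (BlockType and RelationshipType live in Amazon.Textract, the request and Block classes in Amazon.Textract.Model):
using Amazon.Textract;
using Amazon.Textract.Model;
using System.Collections.Generic;
using System.Linq;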
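If it helps to see how the blocks link together without calling AWS, here is a minimal sketch that runs on in-memory data. In a Textract response, keys and values are both KEY_VALUE_SET blocks: the KEY block points to its VALUE block, and each of them points to the WORD blocks that hold the text. MiniBlock, KeyValueDemo, and the sample IDs and words are made up for illustration; they are not SDK types, and a real Block carries more fields than this.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical stand-in for Amazon.Textract.Model.Block; not an SDK type.
public class MiniBlock
{
    public string Id = "";
    public string BlockType = "";
    public string EntityType = "";   // "KEY" or "VALUE" on KEY_VALUE_SET blocks
    public string Text = "";         // only WORD blocks carry text here
    // Relationship type ("CHILD" or "VALUE") mapped to the linked block IDs.
    public Dictionary<string, string[]> Links = new Dictionary<string, string[]>();
}

public static class KeyValueDemo
{
    // Joins the text of the WORD blocks reachable through CHILD links.
    static string GetText(MiniBlock block, Dictionary<string, MiniBlock> byId)
    {
        if (!block.Links.TryGetValue("CHILD", out var childIds)) return "";
        return string.Join(" ", childIds.Select(id => byId[id].Text));
    }

    public static Dictionary<string, string> Extract(List<MiniBlock> blocks)
    {
        var byId = blocks.ToDictionary(b => b.Id);
        var result = new Dictionary<string, string>();
        foreach (var key in blocks.Where(b => b.BlockType == "KEY_VALUE_SET" && b.EntityType == "KEY"))
        {
            // KEY --VALUE--> VALUE block --CHILD--> WORD blocks
            var valueIds = key.Links.TryGetValue("VALUE", out var ids) ? ids : new string[0];
            var valueText = string.Join(" ", valueIds.Select(id => GetText(byId[id], byId)));
            result[GetText(key, byId)] = valueText;
        }
        return result;
    }

    // One form field: the key "Name:" with the value "Jane Doe".
    public static List<MiniBlock> SampleBlocks() => new List<MiniBlock>
    {
        new MiniBlock { Id = "k1", BlockType = "KEY_VALUE_SET", EntityType = "KEY",
            Links = { ["VALUE"] = new[] { "v1" }, ["CHILD"] = new[] { "w1" } } },
        new MiniBlock { Id = "v1", BlockType = "KEY_VALUE_SET", EntityType = "VALUE",
            Links = { ["CHILD"] = new[] { "w2", "w3" } } },
        new MiniBlock { Id = "w1", BlockType = "WORD", Text = "Name:" },
        new MiniBlock { Id = "w2", BlockType = "WORD", Text = "Jane" },
        new MiniBlock { Id = "w3", BlockType = "WORD", Text = "Doe" },
    };

    public static void Main()
    {
        foreach (var pair in Extract(SampleBlocks()))
            Console.WriteLine($"{pair.Key} => {pair.Value}");   // prints "Name: => Jane Doe"
    }
}
```

Using a dictionary keyed by the key text means a repeated label on the form overwrites the earlier entry; if your documents can repeat labels, a list of pairs is probably the safer container.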