From the course: Using Microsoft SharePoint Syntex for AI Document Management

Planning before creating your model

- [Instructor] As much as I'd love to dive right in and start creating my form processing model, there's some important work we need to do first. We need to define the information we want to extract from our forms. Taking the time to understand what the goals are may seem like a lot of effort, but I can tell you, from experience, that this will save you a lot of time later. Leaping before you look will very often result in having to go back and redo the model to get it right.

I'm going to be working with example files that Microsoft has been nice enough to provide. They give you two sets of forms, which is perfect for our purposes. Now, there's a convoluted way to get to these, but you can go straight to the Microsoft Documentation page and download the zip file from here. By the way, I'll be including a document in the course files that includes this link plus links to other helpful information that I think you should have. Anyways, here are the two forms that we're going to work with: Adatum and Contoso invoices.

Now, what information would we want to be able to capture here? Starting with Adatum, I think I want the date and the invoice number, the customer name, of course, and there's a customer ID number. I'll take that. And then the customer's address, and I'd like to separate that by street, state, and post code. So I want all of those things. Now, I've got a salesperson and I've got payment terms. That's really good. Now, I would love to be able to get line item information from the table in this form, although I'll tell you now that we're destined for disappointment on that one, and I will show you what I mean when we get there. But I do want to get the invoice total. And while we're at it, I might as well get the sales tax, and we'll get the subtotal as well, just to have that information. Great.

Now, let's look at the Contoso form. It looks a little bit different, but there's a lot of the same information available here. Not absolutely everything. There's no salesperson. There's no customer ID. There are no payment terms. However, there is a phone number, and I'd like to get that. Oh, and there's a shipping line. Okay. So I want to get that as well. Now, there are also line items in a table here, although it's laid out differently than the Adatum one was.

So from all of this, we can put together a list of all the data that we're going to be extracting, and this will cover both sets of forms. Now, an important thing to remember: you can use the same model to extract information from different sets of forms as long as they are dealing with the same type of data. Each distinct layout or format becomes a collection, and you can have multiple collections within the same model. In our case, we're going to have two collections. And when the time comes, we'll be able to tell Syntex exactly where to look for information in each one.
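If it helps to see that combined list written down, here is a minimal planning sketch in Python. It is not part of Syntex and calls no Syntex API. The field names are hypothetical labels chosen for illustration, and it assumes the Contoso form carries every Adatum field except the three noted as missing. Line items are left out, since extracting them from the tables is expected to be a problem.

```python
# Planning sketch only: these field names are hypothetical labels,
# not identifiers defined by SharePoint Syntex.
ADATUM_FIELDS = {
    "Date", "InvoiceNumber", "CustomerName", "CustomerID",
    "Street", "State", "PostCode",
    "Salesperson", "PaymentTerms",
    "Subtotal", "SalesTax", "InvoiceTotal",
}

# Assumption: Contoso has every Adatum field except the three the
# walkthrough says are missing, plus a phone number and a shipping line.
CONTOSO_FIELDS = (ADATUM_FIELDS - {"Salesperson", "CustomerID", "PaymentTerms"}) | {
    "Phone", "Shipping",
}

# One entry per collection (one per form layout) within the same model.
COLLECTIONS = {"Adatum": ADATUM_FIELDS, "Contoso": CONTOSO_FIELDS}


def combined_field_list(collections):
    """Map every field to extract to the collections that contain it."""
    all_fields = sorted(set().union(*collections.values()))
    return {
        field: [name for name, fields in collections.items() if field in fields]
        for field in all_fields
    }


if __name__ == "__main__":
    for field, sources in combined_field_list(COLLECTIONS).items():
        print(f"{field}: {', '.join(sources)}")
```

Running it prints one line per field with the collections it appears in, which makes it easy to spot fields that only one layout can supply, such as the phone number or the salesperson.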
