This service is designed to parse resumes/CVs. It assumes that all files passed to it are resumes/CVs. It does not attempt to detect whether a document is a resume/CV or not. It should not be used to try to extract information from other types of documents.
This service supports all commercially viable document formats used for text documents. It does not parse image files (such as TIFF or JPG) or scanned images embedded in supported document formats. Always send the original file, not the result of copy/paste, not a conversion by some other software, not a scanned image, and not a version marked up with recruiter notes or other non-resume information. If you pass garbage into the service, you are likely to get garbage out. The best results are always obtained by parsing the original resume/CV file.
If you are running batch transactions (e.g. iterating through files in a folder), do not try to reparse a file after the service returns an exception for it: you will get the same result each time, and credits will be deducted from your account.
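The batch guidance above can be sketched as a loop that parses each file exactly once and records failures instead of retrying them. This is a minimal sketch, not an official client: `parse_folder` and the `parse_file` callback are hypothetical names, and `parse_file` stands in for whatever code actually calls the service.

```python
from pathlib import Path
from typing import Callable, Dict


def parse_folder(folder: Path, parse_file: Callable[[Path], dict]) -> Dict[str, object]:
    """Parse every file in `folder` once; keep the exception for any file that fails."""
    results: Dict[str, object] = {}
    for path in sorted(folder.iterdir()):
        if not path.is_file():
            continue
        try:
            results[path.name] = parse_file(path)
        except Exception as exc:
            # Reparsing would return the same result, so record the failure
            # and move on to the next file.
            results[path.name] = exc
    return results
```

Files that map to an exception in the returned dictionary can then be reviewed by hand rather than resubmitted automatically.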
Documents parsed by an account without AI Matching enabled will never be able to be used for matching/searching. For more information on enabling AI Matching, reach out to sales@sovren.com.
Request Body
DocumentAsBase64String required string
A Base64-encoded string of the resume file bytes. This should use the standard 'base64' encoding as defined in RFC 4648 Section 4 (not the 'base64url' variant). .NET users can use the Convert.ToBase64String(byte[]) method.
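For readers not on .NET, a comparable encoding step in Python might look like the sketch below. The function name `encode_document` is hypothetical; `base64.b64encode` from the standard library uses the standard alphabet (with `+` and `/`), not the URL-safe one.

```python
import base64


def encode_document(file_bytes: bytes) -> str:
    """Return the standard (RFC 4648 Section 4) Base64 text for the raw file bytes."""
    # b64encode returns bytes; the request body needs a string.
    return base64.b64encode(file_bytes).decode("ascii")
```

Read the file in binary mode (for example `open(path, "rb").read()`) so that the original bytes are encoded unchanged.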
DocumentLastModified required string
Mandatory date, in YYYY-MM-DD format, so that the Parser knows how to interpret dates in the document that are expressed as "current" or "as of" or similar. To find out why this is so important and how to calculate/find it, read here. Failing to pass a DocumentLastModified, or passing a DocumentLastModified that is clearly improbable, may result in rejection of data and/or additional charges, and will utterly decimate the usefulness of AI Matching. Use of the DocumentLastModified field is subject to the Acceptable Use Policy.
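One possible way to produce the YYYY-MM-DD value is to read the file's modification time from the file system, as sketched below. This assumes the timestamp on disk still reflects when the candidate last edited the resume (copying or downloading a file often resets it) and that interpreting it in UTC is acceptable; `last_modified_date` is a hypothetical helper name.

```python
import os
from datetime import datetime, timezone


def last_modified_date(path: str) -> str:
    """Format the file's modification time as YYYY-MM-DD (UTC)."""
    mtime = os.path.getmtime(path)  # seconds since the epoch
    return datetime.fromtimestamp(mtime, tz=timezone.utc).strftime("%Y-%m-%d")
```

If your system stores a more trustworthy date (such as the date the resume was received), that value is likely a better choice than the file system timestamp.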
OutputHtml optional bool
When true, the original file is converted to HTML and stored in the Html property.
OutputRtf optional bool
When true, the original file is converted to RTF and stored in the Rtf property.
OutputPdf optional bool
When true, the original file is converted to PDF and stored in the Pdf property as a byte array.
OutputCandidateImage optional bool
When true, if the document contains inline images, the image that is most likely to be a photo of the candidate is returned as a byte array.
Configuration optional string
Optional parser configuration string to be used for parsing. If not specified, the default parser configuration will be used. For more information regarding the parser configuration string and assistance generating one, refer to the Parser Configuration Documentation.
SkillsData optional string[]
Unavailable except in special cases. Please reach out to support@sovren.com. String[] of your custom skills list names and the Sovren "builtin" skills list. If no list is provided, the Sovren builtin skills list will be used. The parser automatically detects language and looks for a corresponding skills list in that language; if no match is found, this list is ignored.
NormalizerData optional string
Unavailable except in special cases. Please reach out to support@sovren.com. Name of your custom normalization data file. If no list is provided, the Sovren builtin skills list will be used (English only). When using custom normalization files, the language to be used is determined by the Parser (the default fallback language is English if the Parser cannot find a match).
GeocodeOptions optional object
Get or insert geocode coordinate values (latitude/longitude) during the parse transaction.
GeocodeOptions.IncludeGeocoding optional bool
When set to true, the parsed address is automatically geocoded through an API call to our /geocode endpoint, and you will be charged accordingly. This parameter defaults to false.
GeocodeOptions.Provider optional string
The Provider you wish to use to geocode the postal address (current options are "Google", "Bing", or "None"). If not specified, we will default to Google. If you are just trying to update the postal address in the document, please set this to "None". If passing "Google" or "Bing", ProviderKey is required.
GeocodeOptions.ProviderKey optional string
The Provider Key for the specified Provider. If using Bing you must specify your own provider key.
GeocodeOptions.PostalAddress optional object
The postal address you wish to geocode. For best results, specify as many of the PostalAddress fields as possible. If provided, this address will be used to get the geocode coordinates instead of the address included in the ParsedDocument (if present); however, the address in the ParsedDocument will not be modified.
The address line (i.e. Street address for U.S. address) for the postal address
GeocodeOptions.GeoCoordinates optional object
The geographic coordinates (latitude/longitude) for your postal address. Use this if you already have latitude/longitude coordinates and simply wish to add them to your parsed document. If provided, these values will be inserted into your ParsedDocument, and the address included in the ParsedDocument (if present) will not be modified.
When your account is enabled for Matching/Searching you can automatically index documents during the parse transactions.
IndexingOptions.IndexId optional string
Determines which index the parsed document is placed in. This is case-insensitive.
IndexingOptions.DocumentId optional string
Determines the id given to the parsed document. This is restricted to alphanumeric characters, dashes, and underscores. All values will be converted to lower-case.
IndexingOptions.UserDefinedTags optional string[]
The user-defined tags you want the document to have.
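As an illustration of how the request fields fit together, the following sketch assembles a request body as a Python dictionary. It assumes the fields are sent as a JSON object with the nesting implied by the dotted names above, and that IndexId and DocumentId are supplied together; the function name `build_parse_request` and its parameter names are hypothetical. The DocumentId check mirrors the documented restriction (alphanumeric characters, dashes, and underscores), and the value is lower-cased locally so that the id you store matches the one the service will use.

```python
import json
import re
from typing import List, Optional

# Allowed DocumentId characters per the field description above.
_DOCUMENT_ID = re.compile(r"[A-Za-z0-9_-]+")


def build_parse_request(
    document_b64: str,
    last_modified: str,
    index_id: Optional[str] = None,
    document_id: Optional[str] = None,
    user_defined_tags: Optional[List[str]] = None,
    include_geocoding: bool = False,
) -> dict:
    """Assemble a parse request body (assumed JSON layout)."""
    body: dict = {
        "DocumentAsBase64String": document_b64,
        "DocumentLastModified": last_modified,  # YYYY-MM-DD
    }
    if include_geocoding:
        body["GeocodeOptions"] = {"IncludeGeocoding": True}
    if index_id is not None and document_id is not None:
        if not _DOCUMENT_ID.fullmatch(document_id):
            raise ValueError("DocumentId may only contain letters, digits, '-' and '_'")
        body["IndexingOptions"] = {
            "IndexId": index_id,
            "DocumentId": document_id.lower(),
            "UserDefinedTags": list(user_defined_tags or []),
        }
    return body
```

The result can be serialized with `json.dumps(body)` before it is sent.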
A response code elaborating on the HTTP status code. The following is a list of codes that can be returned by the service:
Success - Successful transaction
ConversionException - There was an issue converting the document
MissingParameter - A required parameter wasn't provided
InvalidParameter - A parameter was incorrectly specified
AuthenticationError - An error occurred with the credentials provided
Info.Message string
This message further describes the code providing additional detail.
Info.TransactionId string
The (GUID) id for a specific API transaction. Use this when contacting support@sovren.com about issues.
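A caller might check the transaction-level code before touching the rest of the response, keeping the TransactionId at hand for support requests. The sketch below assumes the response has already been decoded from JSON into a dictionary and that the response code lives at `Info.Code` (by analogy with `Info.Message`); `ApiCallError` and `require_success` are hypothetical names.

```python
class ApiCallError(Exception):
    """Raised when Info.Code is anything other than Success."""


def require_success(response: dict) -> dict:
    """Return response['Value'] when the transaction succeeded, else raise."""
    info = response.get("Info", {})
    code = info.get("Code")
    if code != "Success":
        # Include the TransactionId so it can be quoted to support.
        raise ApiCallError(
            f"{code}: {info.get('Message')} (TransactionId={info.get('TransactionId')})"
        )
    return response.get("Value", {})
```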
Info.EngineVersion string
The version of the parsing/matching engine running under-the-hood.
Info.ApiVersion string
The version of the API.
Info.TotalElapsedMilliseconds integer
How long the transaction took on Sovren's server, in milliseconds. If the transaction takes longer to complete on the client side, that extra duration is solely network latency.
Info.TransactionCost decimal
How many credits the transaction costs.
Info.CustomerDetails object
Information about the customer who made the API call.
Value.CustomerDetails.AccountId string
The AccountId for the account.
Value.CustomerDetails.Name string
The customer name on the account.
Value.CustomerDetails.IPAddress string
The client IP Address where the API call originated.
Value.CustomerDetails.Region string
The region for the account, also known as the 'Data Center'.
Value.CustomerDetails.CreditsRemaining decimal
The number of credits remaining to be used by the account.
The number of requests that can be made at one time. If using sub-accounts, this is the maximum number of concurrent requests across all accounts, not just this single sub-account.
Value.CustomerDetails.ExpirationDate date
The date that the current credits expire.
Value object
Contains response data for the transaction.
Value.ParsingResponse object
The status of the parse transaction.
Value.ParsingResponse.Code string
The following is a list of codes that can be returned by the service:
Success - Successful transaction
WarningsFoundDuringParsing - Parsing was successful. This is not an error code. This is an advanced level message about the document, not about the parsing. For more information, refer to the ResumeQuality section in the parsed document output and to the documentation here.
PossibleTruncationFromTimeout - The timeout occurred before the document was finished parsing, which can result in truncation
ConversionException - There was an issue converting the document
Value.ParsingResponse.Message string
A short human-readable description explaining the Code value.
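The four parsing codes above can be reduced to a simple three-way decision in client code. The following is a minimal sketch assuming `value` is the decoded `Value` object; the labels "ok", "partial", and "failed" and the function name `classify_parsing` are hypothetical choices for illustration.

```python
def classify_parsing(value: dict) -> str:
    """Map Value.ParsingResponse.Code to 'ok', 'partial', or 'failed'."""
    code = value.get("ParsingResponse", {}).get("Code")
    if code in ("Success", "WarningsFoundDuringParsing"):
        # WarningsFoundDuringParsing is not an error: parsing succeeded.
        return "ok"
    if code == "PossibleTruncationFromTimeout":
        # Output exists but may be truncated.
        return "partial"
    # ConversionException, or any code this sketch does not know about.
    return "failed"
```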
Value.GeocodeResponse object
If geocoding was requested in the ParseOptions.GeocodeOptions, the status of the geocode transaction will be output here.
Value.GeocodeResponse.Code string
The following is a list of codes that can be returned by the service:
Success - Successful transaction
MissingParameter - A required parameter wasn't provided
InvalidParameter - A parameter was incorrectly specified
InsufficientData - The address provided doesn't have enough granularity to be geocoded effectively
CoordinatesNotFound - The geocoding provider couldn't find any coordinates matching the provided address
Value.GeocodeResponse.Message string
A short human-readable description explaining the Code value.
Value.IndexingResponse object
If indexing was requested in the ParseOptions.IndexingOptions, the status of the index transaction will be output here.
Value.IndexingResponse.Code string
The following is a list of codes that can be returned by the service:
Success - Successful transaction
MissingParameter - A required parameter wasn't provided
InvalidParameter - A parameter was incorrectly specified
AuthenticationError - An error occurred with the credentials provided
DataNotFound - Data with the specified name wasn't found
Value.IndexingResponse.Message string
A short human-readable description explaining the Code value.
Value.ResumeData object
The main output from the Sovren Resume Parser.
Value.ResumeData.ContactInformation object
The candidate's contact information found on the resume.
The candidate's location/address. The Parser does not standardize addresses. Address standardization services are available, including for example the Google Maps API, that can take the Parser's contact info fields and standardize/geocode the data.
These values are not very global-friendly, but the Parser does normalize all degrees to one of these pre-defined types. This list is sorted, as well as possible, by increasing level of education, although there are certainly ambiguities from one discipline to another, such as whether professional is above or below bachelors. Here are the possible values:
Normalized GPA is a decimal value that is output only when a GPA has been provided. This value is normalized from 0.0 to 1.0, with 1.0 being the top mark, so that all GPAs across all scales can be compared, taking into account different min/max values and whether high or low numbers are ranked higher. For example:
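The Parser's exact normalization method is not spelled out here, so the sketch below simply assumes a linear rescaling between the scale's minimum and maximum, flipped when lower marks are better. The function name `normalize_gpa` and its parameters are hypothetical.

```python
def normalize_gpa(score: float, min_score: float, max_score: float,
                  high_is_better: bool = True) -> float:
    """Linearly rescale a GPA to the range 0.0-1.0, where 1.0 is the top mark."""
    if max_score <= min_score:
        raise ValueError("max_score must be greater than min_score")
    fraction = (score - min_score) / (max_score - min_score)
    # On scales where low numbers rank higher, invert the fraction.
    return fraction if high_is_better else 1.0 - fraction
```

Under this assumption, a 3.0 on a 0.0 to 4.0 scale and a 2.0 on a 1.0 to 5.0 scale where 1.0 is the best mark both normalize to 0.75, which makes them directly comparable.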
A paragraph of text that summarizes the candidate's experience. This paragraph is generated based on other data points in the ExperienceSummary. It will be in the same language as the resume for Czech, Dutch, English, French, German, Greek, Hungarian, Italian, Norwegian, Russian, Spanish, and Swedish. To always generate the summary in English, set "OutputFormat.AllSummariesInEnglish = True;" in the configuration string when parsing.
In order for this value to be accurate, you must have provided an accurate DocumentLastModified when you parsed this resume.
The number of months of work experience as indicated by the range of start and end date values in the various jobs/positions in the resume. Overlapping date ranges are not double-counted. This value is NOT derived from text like "I have 15 years of experience".
In order for this value to be accurate, you must have provided an accurate DocumentLastModified when you parsed this resume.
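Counting months without double-counting overlapping date ranges amounts to merging intervals. The sketch below illustrates the idea rather than the Parser's own implementation: it assumes each position is a (start, end) pair of dates, that both the start month and the end month count as worked, and that an open-ended "current" position has already had its end date replaced by DocumentLastModified. The names `month_index` and `months_of_experience` are hypothetical.

```python
from datetime import date
from typing import Iterable, Tuple


def month_index(d: date) -> int:
    """Number the calendar months consecutively so they can be compared."""
    return d.year * 12 + (d.month - 1)


def months_of_experience(ranges: Iterable[Tuple[date, date]]) -> int:
    """Count distinct calendar months covered by the ranges (end month inclusive)."""
    spans = sorted((month_index(start), month_index(end)) for start, end in ranges)
    total = 0
    covered_until = None  # first month index not yet counted
    for start, end in spans:
        end_exclusive = end + 1
        if covered_until is None or start >= covered_until:
            # No overlap with what has been counted so far.
            total += end_exclusive - start
            covered_until = end_exclusive
        elif end_exclusive > covered_until:
            # Partial overlap: count only the months past the covered part.
            total += end_exclusive - covered_until
            covered_until = end_exclusive
    return total
```

For example, positions held January to June 2020 and April to September 2020 overlap for three months, so together they contribute nine months rather than twelve.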
The number of months of management experience as indicated by the range of start and end date values in the various jobs/positions in the resume that have been determined to be management-level positions. Overlapping date ranges are not double-counted. This value is NOT derived from text like "I have 10 years of management experience".
In order for this value to be accurate, you must have provided an accurate DocumentLastModified when you parsed this resume.
Job titles are examined to determine the best category for any executive experience. If the candidate possesses no executive experience, "NONE" will be output. Possible values are:
A score (0-100), where 0 means a candidate is more likely to have had (and want/pursue) short-term/part-time/temp/contracting jobs and 100 means a candidate is more likely to have had (and want/pursue) traditional full-time, direct-hire jobs.
In order for this value to be accurate, you must have provided an accurate DocumentLastModified when you parsed this resume.
The highest score calculated from any of the position titles. The score is based on the wording of the title, not on the experience described within the position description.
Any abnormal findings about the candidate's career will be reported here. For example, if the candidate held a management-level position in a previous job, but not their current job.
True if Sovren found this by matching from a known list of certifications. False if Sovren found this by analyzing the context and determining it was a certification.
Sovren generates several possible variations for some certifications to be used in AI Matching. This greatly improves Matching, since different candidates have different ways of listing a certification. If this certification is a Sovren-created variation of a certification found on the resume, this property will be true.
Value.ResumeData.Licenses object[]
Licenses found on a resume. These are professional licenses, not driving licenses. For driving licenses, see PersonalAttributes.
Value.ResumeData.Licenses[i].Name string
The name of the license.
Value.ResumeData.Licenses[i].MatchedFromList bool
True if Sovren found this by matching from a known list of licenses. False if Sovren found this by analyzing the context and determining it was a license.
Any text within the Text that is recognized as a qualification (such as DDS), degree (such as B.S.), or a certification (such as PMP). Each qualification is listed separately.
The primary language of the parsed text. The value is one of the ISO 639-1 codes. When the language could not be automatically determined, it is reported as the special value Invariant/Unknown (iv). The two-letter ISO codes reported by the Parser, such as zh for Chinese, do not differentiate between language variants, such as Mandarin and Cantonese.
For a listing of the languages and regions supported by the most recent version, refer to the parser tech specs.
This is an RFC 3066 language tag that represents the actual cultural context regarding formatting of numbers, dates, character symbols, and so on. This value is usually a simple concatenation of the Language and Country codes, such as "en-US" for US English, but beware that CultureInfo can be set independently of Language and Country to achieve fine-tuned cultural control over parsing, so if you use this value you should not assume that it always matches the Language and Country.
A digital signature used to ensure there is no tampering between parsing and indexing. This prevents Sovren from storing any PII in the AI Matching engine.
The exact text that was used to identify the beginning of the section. If there was no text indicator and the location was calculated, then the value is "CALCULATED".
A list of quality assessments for the resume. These are very useful for providing feedback to candidates about why their resume did not parse properly. These can also be used to determine if a resume is 'high quality' enough to put into your system. More information is available in the Resume Quality Documentation.
A list of user-defined tags that are assigned to this resume. These are used to filter search/match queries in the AI Matching Engine.
NOTE: you may add/remove these prior to indexing. This is the only property you may modify prior to indexing.
Value.RedactedResumeData object
This property is the Value.ResumeData with all of the Personally Identifiable Information (PII) fields such as first name, last name, email addresses, phone numbers, etc. redacted.
Value.ConversionMetadata object
Information about converting the document to plain text.
The suggested extension based on the DetectedType.
Value.ConversionMetadata.OutputValidityCode string
The computed validity based on the source text. This will indicate whether a document looks like a legitimate resume or not. See here for more details.