[MS-SQP]: MSSearch Query Protocol
Intellectual Property Rights Notice for Open Specifications Documentation
Technical Documentation. Microsoft publishes Open Specifications documentation for protocols, file formats, languages, standards as well as overviews of the interaction among each of these technologies.
Copyrights. This documentation is covered by Microsoft copyrights. Regardless of any other terms that are contained in the terms of use for the Microsoft website that hosts this documentation, you may make copies of it in order to develop implementations of the technologies described in the Open Specifications and may distribute portions of it in your implementations using these technologies or your documentation as necessary to properly document the implementation. You may also distribute in your implementation, with or without modification, any schema, IDLs, or code samples that are included in the documentation. This permission also applies to any documents that are referenced in the Open Specifications.
No Trade Secrets. Microsoft does not claim any trade secret rights in this documentation.
Patents. Microsoft has patents that may cover your implementations of the technologies described in the Open Specifications. Neither this notice nor Microsoft's delivery of the documentation grants any licenses under those or any other Microsoft patents. However, a given Open Specification may be covered by Microsoft Open Specification Promise or the Community Promise. If you would prefer a written license, or if the technologies described in the Open Specifications are not covered by the Open Specifications Promise or Community Promise, as applicable, patent licenses are available by contacting [email protected].
Trademarks. The names of companies and products contained in this documentation may be covered by trademarks or similar intellectual property rights. This notice does not grant any licenses under those rights. For a list of Microsoft trademarks, visit www.microsoft.com/trademarks.
Fictitious Names. The example companies, organizations, products, domain names, e-mail addresses, logos, people, places, and events depicted in this documentation are fictitious. No association with any real company, organization, product, domain name, email address, logo, person, place, or event is intended or should be inferred.
Reservation of Rights. All other rights are reserved, and this notice does not grant any rights other than specifically described above, whether by implication, estoppel, or otherwise.
Tools. The Open Specifications do not require the use of Microsoft programming tools or programming environments in order for you to develop an implementation. If you have access to Microsoft programming tools and environments, you are free to take advantage of them. Certain Open Specifications are intended for use in conjunction with publicly available standard specifications and network programming art, and assume that the reader either is familiar with the aforementioned material or has immediate access to it.
Revision Summary
Date | Revision History | Revision Class | Comments
04/04/2008 | 0.1 | | Initial Availability
06/27/2008 | 1.0 | Major | Revised and edited the technical content.
12/12/2008 | 1.01 | Editorial | Revised and edited the technical content.
07/13/2009 | 1.02 | Major | Revised and edited the technical content.
08/28/2009 | 1.03 | Editorial | Revised and edited the technical content.
11/06/2009 | 1.04 | Editorial | Revised and edited the technical content.
02/19/2010 | 2.0 | Editorial | Revised and edited the technical content.
03/31/2010 | 2.01 | Editorial | Revised and edited the technical content.
04/30/2010 | 2.02 | Editorial | Revised and edited the technical content.
06/07/2010 | 2.03 | Editorial | Revised and edited the technical content.
06/29/2010 | 2.04 | Editorial | Changed language and formatting in the technical content.
07/23/2010 | 2.05 | Minor | Clarified the meaning of the technical content.
09/27/2010 | 2.05 | No change | No changes to the meaning, language, or formatting of the technical content.
11/15/2010 | 2.05 | No change | No changes to the meaning, language, or formatting of the technical content.
12/17/2010 | 2.06 | Editorial | Changed language and formatting in the technical content.
03/18/2011 | 2.06 | No change | No changes to the meaning, language, or formatting of the technical content.
06/10/2011 | 2.06 | No change | No changes to the meaning, language, or formatting of the technical content.
01/20/2012 | 2.7 | Minor | Clarified the meaning of the technical content.
04/11/2012 | 2.7 | No change | No changes to the meaning, language, or formatting of the technical content.
07/16/2012 | 2.7 | No change | No changes to the meaning, language, or formatting of the technical content.
09/12/2012 | 2.7 | No change | No changes to the meaning, language, or formatting of the technical content.
10/08/2012 | 2.7 | No change | No changes to the meaning, language, or formatting of the technical content.
02/11/2013 | 2.7 | No change | No changes to the meaning, language, or formatting of the technical content.
07/30/2013 | 2.7 | No change | No changes to the meaning, language, or formatting of the technical content.
Table of Contents
1 Introduction
1.1 Glossary
1.2 References
1.2.1 Normative References
1.2.2 Informative References
1.3 Overview
1.3.1 Remote Querying
1.4 Relationship to Other Protocols
1.5 Prerequisites/Preconditions
1.6 Applicability Statement
1.7 Versioning and Capability Negotiation
1.8 Vendor-Extensible Fields
1.9 Standards Assignments
2 Messages
2.1 Transport
2.2 Message Syntax
2.2.1 Structures
2.2.1.1 CBaseStorageVariant
2.2.1.1.1 CBaseStorageVariant Structures
2.2.1.1.1.1 VT_VECTOR
2.2.1.2 CFullPropSpec
2.2.1.3 CContentRestriction
2.2.1.4 CNatLanguageRestriction
2.2.1.5 CNodeRestriction
2.2.1.6 CPropertyRestriction
2.2.1.7 CSort
2.2.1.8 CVectorRestriction
2.2.1.9 CRestriction
2.2.1.10 CColumnSet
2.2.1.11 CDbColId
2.2.1.12 CDbProp
2.2.1.12.1 Database Properties
2.2.1.13 CDbPropSet
2.2.1.14 CPidMapper
2.2.1.15 CRowsetProperties
2.2.1.16 CRowVariant
2.2.1.17 CSortSet
2.2.1.18 CTableColumn
2.2.1.19 QUERYMETADATA
2.2.2 Message Headers
2.2.3 Messages
2.2.3.1 CPMConnectIn
2.2.3.2 CPMConnectOut
2.2.3.3 CPMCreateQueryIn
2.2.3.4 CPMCreateQueryOut
2.2.3.5 CPMSetBindingsIn
2.2.3.6 CPMGetRowsIn
2.2.3.7 CPMGetRowsOut
2.2.3.8 CPMFetchValueIn
2.2.3.9 CPMFetchValueOut
2.2.3.10 CPMFreeCursorIn
2.2.3.11 CPMFreeCursorOut
2.2.3.12 CPMDisconnect
2.2.4 Errors
3 Protocol Details
3.1 Server Details
3.1.1 Abstract Data Model
3.1.2 Timers
3.1.3 Initialization
3.1.4 Higher-Layer Triggered Events
3.1.5 Message Processing Events and Sequencing Rules
3.1.5.1 Receiving a CPMConnectIn Request
3.1.5.2 Receiving a CPMCreateQueryIn Request
3.1.5.3 Receiving a CPMSetBindingsIn Request
3.1.5.4 Receiving a CPMFetchValueIn Request
3.1.5.5 Receiving a CPMGetRowsIn Request
3.1.5.6 Receiving a CPMFreeCursorIn Request
3.1.5.7 Receiving a CPMDisconnect Request
3.1.6 Timer Events
3.1.7 Other Local Events
3.2 Client Details
3.2.1 Abstract Data Model
3.2.2 Timers
3.2.3 Initialization
3.2.4 Higher-Layer Triggered Events
3.2.4.1 Query Server Query Messages
3.2.4.1.1 Sending a CPMConnectIn Request
3.2.4.1.2 Sending a CPMCreateQueryIn Request
3.2.4.1.3 Sending a CPMSetBindingsIn Request
3.2.4.1.4 Sending a CPMGetRowsIn Request
3.2.4.1.5 Sending a CPMFetchValueIn Request
3.2.4.1.6 Sending a CPMFreeCursorIn Request
3.2.4.1.7 Sending a CPMDisconnect Message
3.2.5 Message Processing Events and Sequencing Rules
3.2.5.1 Receiving a CPMCreateQueryOut Response
3.2.5.2 Receiving a CPMFetchValueOut Response
3.2.5.3 Receiving a CPMGetRowsOut Response
3.2.5.4 Receiving a CPMFreeCursorOut Response
3.2.6 Timer Events
3.2.7 Other Local Events
4 Protocol Examples
4.1 Obtaining Document Identifiers Based on Query Text
5 Security
5.1 Security Considerations for Implementers
5.2 Index of Security Parameters
6 Appendix A: Product Behavior
7 Change Tracking
8 Index
1 Introduction
This document specifies the MSSearch Query Protocol, which enables a protocol client to communicate with a protocol server to issue search queries.
Sections 1.8, 2, and 3 of this specification are normative and can contain the terms MAY, SHOULD, MUST, MUST NOT, and SHOULD NOT as defined in RFC 2119. Sections 1.5 and 1.9 are also normative but cannot contain those terms. All other sections and examples in this specification are informative.
1.1 Glossary
The following terms are defined in [MS-GLOS]:
Coordinated Universal Time (UTC)
GUID
handle
HRESULT
language code identifier (LCID)
little-endian
named pipe
The following terms are defined in [MS-OFCGLOS]:
binary large object (BLOB)
column
command tree
full-text index catalog
index server
item
natural language query
noise word
property identifier
query expansion
query result
query server
restriction
row
search catalog
search query
sort order
stemming
token
The following terms are specific to this document:
SharePoint Search SQL syntax: A set of SQL-based rules that govern the construction of a search query.
MAY, SHOULD, MUST, SHOULD NOT, MUST NOT: These terms (in all caps) are used as described in [RFC2119]. All statements of optional behavior use either MAY, SHOULD, or SHOULD NOT.
1.2 References
References to Microsoft Open Specifications documentation do not include a publishing year because links are to the latest version of the technical documents, which are updated frequently. References to other documents include a publishing year when one is available.
1.2.1 Normative References
We conduct frequent surveys of the normative references to assure their continued availability. If you have any issue with finding a normative reference, please contact [email protected]. We will assist you in finding the relevant information. Please check the archive site, http://msdn2.microsoft.com/en-us/library/E4BD6494-06AD-4aed-9823-445E921C9624, as an additional source.
[IEEE754] Institute of Electrical and Electronics Engineers, "Standard for Binary Floating-Point Arithmetic", IEEE 754-1985, October 1985, http://ieeexplore.ieee.org/servlet/opac?punumber=2355
[MS-ERREF] Microsoft Corporation, "Windows Error Codes".
[MS-LCID] Microsoft Corporation, "Windows Language Code Identifier (LCID) Reference".
[MS-SEARCH] Microsoft Corporation, "Search Protocol".
[MS-SMB] Microsoft Corporation, "Server Message Block (SMB) Protocol".
[RFC2119] Bradner, S., "Key words for use in RFCs to Indicate Requirement Levels", BCP 14, RFC 2119, March 1997, http://www.rfc-editor.org/rfc/rfc2119.txt
[SALTON] Salton, G., "Automatic Text Processing: The Transformation Analysis and Retrieval of Information by Computer", 1988, ISBN: 0201122278.
[UNICODE] The Unicode Consortium, "Unicode Home Page", 2006, http://www.unicode.org/
1.2.2 Informative References
[MSDN-FULLPROPSPEC] Microsoft Corporation, "FULLPROPSPEC", http://msdn.microsoft.com/en-us/library/ms690996.aspx
[MSDN-OLEDBP-OI] Microsoft Corporation, "OLE DB Programming", http://msdn.microsoft.com/en-us/library/502e07a7(VS.80).aspx
[MSDN-PROPSET] Microsoft Corporation, "Property Sets", http://msdn.microsoft.com/en-us/library/ms691041.aspx
[MSDN-QUERYERR] Microsoft Corporation, "Query-Execution Values", http://msdn.microsoft.com/en-us/library/ms690617.aspx
[MS-GLOS] Microsoft Corporation, "Windows Protocols Master Glossary".
[MS-OFCGLOS] Microsoft Corporation, "Microsoft Office Master Glossary".
1.3 Overview
The search service running on a query server helps efficiently organize the extracted features of a collection of items. The MSSearch Query Protocol allows a protocol client to communicate with a protocol server hosting a search service to issue search queries. When processing files, an index server analyzes a set of items, extracts useful information, and then organizes the extracted information in such a way that properties of those items can be efficiently returned in response to search queries. A collection of items that can be queried comprises a search catalog. A search catalog contains a mechanism for quick word matching and a mechanism for quick retrieval of property values. The index server makes search catalogs available to query servers.
Conceptually, a search catalog consists of a logical table of properties with the text or value and corresponding language code identifier (LCID) stored in columns (1) of the table. Each row of the table corresponds to a separate item in the scope of the search catalog, and each column of the table corresponds to a property.
1.3.1 Remote Querying
The MSSearch Query Protocol enables protocol clients to perform search queries against a remote protocol server hosting a search service. See [MS-SEARCH] for more information about the SharePoint Search SQL syntax.
The protocol client initiates a search query with the following steps:
1. The protocol client requests a connection to a protocol server hosting a search service.
2. The protocol client sends the following parameters for the search query:
   - Rowset properties, for example, the search catalog name and configuration information.
   - The restriction (1) to specify what items are to be included or excluded from the query results.
   - The order in which the query results are to be returned.
   - The columns to be returned in the result set.
   - The maximum number of rows (1) that are to be returned for the search query.
   - The maximum time for query execution.
3. The protocol client requests a result set from the protocol server, and the protocol server responds by sending the protocol client the property values for the items that were included in the query results for the protocol client's query. After the protocol client is finished with the search query, or no longer requires additional query results, the protocol client contacts the protocol server to release the search query.
After the protocol server has released the search query, the protocol client sends a request to disconnect from the protocol server. The protocol client may also disconnect from the protocol server without issuing a disconnect request. The connection is then closed. Alternatively, the protocol client issues another search query and repeats the sequence from step 2.
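The sequence above can be sketched as an ordered list of client requests. This is only an illustration: the mapping from each step to a message name is an assumption based on the message names listed in section 2.2.3, and the helper name client_message_sequence and its parameters are hypothetical. Section 3 remains the authoritative description of sequencing.

```python
def client_message_sequence(query_count, fetches_per_query=1, explicit_disconnect=True):
    """Return the ordered request names for one connection (illustrative only)."""
    if query_count < 1:
        raise ValueError("at least one search query is expected")
    sequence = ["CPMConnectIn"]  # step 1: request a connection
    for _ in range(query_count):
        sequence.append("CPMCreateQueryIn")   # step 2: send the query parameters
        sequence.append("CPMSetBindingsIn")   # assumed placement of the column bindings
        sequence.extend(["CPMGetRowsIn"] * fetches_per_query)  # step 3: request results
        sequence.append("CPMFreeCursorIn")    # release the search query
    if explicit_disconnect:
        # The client may also drop the connection without this request.
        sequence.append("CPMDisconnect")
    return sequence
```

Calling client_message_sequence(2) repeats the portion from step 2 onward once for each search query before the final disconnect.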
1.4 Relationship to Other Protocols
The MSSearch Query Protocol relies on the SMB protocol, as described in [MS-SMB], for message transport. No other protocol depends directly on the MSSearch Query Protocol.
1.5 Prerequisites/Preconditions
It is assumed that the protocol client has obtained the name of the protocol server and a search catalog name before this protocol is invoked. How a protocol client does this is not addressed in this specification.
It is also assumed that the protocol client and protocol server have a security association that is usable with named pipes, as described in [MS-SMB].
1.6 Applicability Statement
The MSSearch Query Protocol is designed for querying search catalogs on a remote server from a client. It is designed to handle a query load of up to 100 search queries per second. The typical rowset size is expected to be in the range of 0 to 5000 rows, with up to 4 columns.
1.7 Versioning and Capability Negotiation
None.
1.8 Vendor-Extensible Fields
This protocol uses HRESULT values as defined in [MS-ERREF] section 2.1. Vendors can define their own HRESULT values, provided that they set the C bit (0x20000000) for each vendor-defined value, indicating the value is a customer code.
This protocol uses NTSTATUS values as defined in [MS-ERREF] section 2.3. Vendors can choose their own values for this field, as long as the C bit (0x20000000) is set, indicating it is a customer code.
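As a minimal sketch of the C bit rule, the following helpers test and set bit 0x20000000 on a 32-bit status value. The names CUSTOMER_BIT, is_customer_code, and make_vendor_code are illustrative and are not defined by this protocol.

```python
CUSTOMER_BIT = 0x20000000  # the C bit described above


def is_customer_code(value):
    """Return True if the 32-bit status value has the C bit set."""
    return (value & CUSTOMER_BIT) != 0


def make_vendor_code(base_value):
    """Return base_value with the C bit set, truncated to 32 bits."""
    return (base_value | CUSTOMER_BIT) & 0xFFFFFFFF
```

For example, make_vendor_code(0x80040001) yields 0xA0040001, which is_customer_code then reports as a customer code.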
Property Identifiers
Properties are represented by property identifiers. Each property MUST have a globally unique identifier. This identifier consists of a GUID, which represents a collection of properties called a property set, plus either a string or a 32-bit integer that identifies the property within the set. If the integer form of the identifier is used, the value MUST NOT be any of the following: 0x00000000, 0xFFFFFFFF, or 0xFFFFFFFE. Vendors can guarantee that their properties are uniquely defined by placing them in a property set defined by their own GUIDs.
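The restriction on the integer form can be checked with a small helper. This is a sketch only; the names RESERVED_PROPERTY_IDS and is_valid_integer_property_id are hypothetical.

```python
# Values that the integer form of a property identifier MUST NOT take.
RESERVED_PROPERTY_IDS = frozenset({0x00000000, 0xFFFFFFFF, 0xFFFFFFFE})


def is_valid_integer_property_id(prop_id):
    """Return True if prop_id fits in 32 bits and is not a reserved value."""
    return 0 <= prop_id <= 0xFFFFFFFF and prop_id not in RESERVED_PROPERTY_IDS
```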
1.9 Standards Assignments
This protocol has no standards assignments, only private assignments that are made by using the allocation procedures described in other protocols.
A named pipe has been allocated to this protocol as described in [MS-SMB]; the assignments are shown in the following table.
Parameter | Value | Reference
Pipe name | \pipe\OSearch | [MS-SMB]
Pipe name | \pipe\SPSearch | [MS-SMB]
2 Messages
The following sections specify how MSSearch Query Protocol messages are transported and define the common MSSearch Query Protocol data types.
Note: All 2-byte, 4-byte, and 8-byte signed and unsigned integers in the following structures and messages MUST be transferred in little-endian byte order.
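To illustrate the byte-order rule, the following sketch uses the Python struct module, where the "<" prefix selects little-endian encoding. The helper names are illustrative only.

```python
import struct


def pack_u16(value):
    """Encode a 2-byte unsigned integer in little-endian order."""
    return struct.pack("<H", value)


def pack_u32(value):
    """Encode a 4-byte unsigned integer in little-endian order."""
    return struct.pack("<I", value)


def pack_i64(value):
    """Encode an 8-byte signed integer in little-endian order."""
    return struct.pack("<q", value)
```

With these helpers, the value 0x12345678 is sent as the bytes 78 56 34 12: the least significant byte goes first.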
2.1 Transport
All messages MUST be transported using a named pipe, as specified in [MS-SMB]. The following pipe names are used:
\pipe\OSearch
\pipe\SPSearch
This protocol uses the underlying SMB named pipe protocol to retrieve the identity of the caller that made the connection, as specified in [MS-SMB]. The protocol client MUST set SECURITY_IDENTIFICATION as the ImpersonationLevel in the request [MS-SMB] to open the named pipe.
2.2 Message Syntax
2.2.1 Structures
This section details data structures that are defined and used by the MSSearch Query Protocol. The following table summarizes the data structures defined in this section.
Structure | Description
CBaseStorageVariant | Contains the value on which to perform a match operation for a property that is specified in a CPropertyRestriction structure.
CFullPropSpec | Contains a property specification.
CContentRestriction | Contains a string to match for a property.
CNatLanguageRestriction | Contains a natural language query match for a property.
CNodeRestriction | Contains an array of command tree nodes specifying the restrictions (1) for a search query.
CPropertyRestriction | Contains a property value to match with an operation.
CSort | Identifies a column to sort.
CVectorRestriction | Contains an array of command tree nodes specifying the restrictions (1) for a vector space array query, as specified in [SALTON].
CRestriction | A restriction (1) node in a query command tree.
CColumnSet | Describes the columns to return.
CDbColId | Contains a column identifier.
CDbProp | Contains a rowset property.
CDbPropSet | Contains a set of rowset properties.
CPidMapper | Maps from message internal property identifiers to CFullPropSpecs.
CRowsetProperties | Contains the configuration information for a search query.
CRowVariant | Contains the fixed-size portion of a variable-length data type stored in the CPMGetRowsOut message.
CSortSet | Contains the sort orders (1) for a search query.
CTableColumn | Contains a column for the CPMSetBindingsIn message.
QUERYMETADATA | Contains information about a search query.
2.2.1.1 CBaseStorageVariant
The CBaseStorageVariant structure contains the value on which to perform a match operation for a property specified in the CPropertyRestriction structure.
Packet diagram (32 bits per row; fields in wire order): Padding1 (variable), vType, vData1, vData2, vValue (variable)
Padding1 (variable): This field MUST be 0 to 3 bytes in length. The length of this field MUST be such that the following field begins at an offset that is a multiple of 4 bytes from the beginning of the message that contains this structure. If this field is present (that is, length nonzero), the value it contains is arbitrary. The content of this field MUST be ignored by the receiver.
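The alignment rule for padding fields can be expressed as a short calculation. The sketch below assumes that offset is the number of bytes already written from the beginning of the message; the function name padding_length is illustrative.

```python
def padding_length(offset, alignment=4):
    """Return how many padding bytes bring offset up to the next multiple of alignment."""
    return (-offset) % alignment
```

For example, a structure that would otherwise begin at offset 5 needs 3 padding bytes, while one beginning at offset 8 needs none. The same calculation with alignment=8 applies to fields that require 8-byte alignment.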
vType (2 bytes): A type indicator that indicates the type of vValue. It MUST be one of the values specified in the following table.
Value | Meaning
VT_EMPTY 0x0000 | vValue is not present.
VT_NULL 0x0001 | vValue is not present.
VT_I1 0x0010 | A 1-byte signed integer.
VT_UI1 0x0011 | A 1-byte unsigned integer.
VT_I2 0x0002 | A 2-byte signed integer.
VT_UI2 0x0012 | A 2-byte unsigned integer.
VT_BOOL 0x000B | A Boolean value; a 2-byte integer. Note: Contains 0x0000 (FALSE) or 0xFFFF (TRUE).
VT_I4 0x0003 | A 4-byte signed integer.
VT_UI4 0x0013 | A 4-byte unsigned integer.
VT_R4 0x0004 | An IEEE 32-bit floating point number, as specified in [IEEE754].
VT_INT 0x0016 | A 4-byte signed integer.
VT_UINT 0x0017 | A 4-byte unsigned integer. Note that this is identical to VT_UI4 except that VT_UINT cannot be used with VT_VECTOR (defined in the following table); the value chosen is a choice made by the higher layer that provides it to MSSearch Query Protocol, but the MSSearch Query Protocol treats VT_UINT and VT_UI4 as identical, with the exception noted previously.
VT_ERROR 0x000A | A 4-byte unsigned integer containing an HRESULT, as specified in [MS-ERREF], section 2.
VT_I8 0x0014 | An 8-byte signed integer.
VT_UI8 0x0015 | An 8-byte unsigned integer.
VT_R8 0x0005 | An IEEE 64-bit floating point number, as specified in [IEEE754].
VT_CY 0x0006 | An 8-byte two's complement integer (scaled by 10,000).
VT_DATE 0x0007 | A 64-bit floating point number, as specified in [IEEE754], representing the number of days since 00:00:00 on December 31, 1899, Coordinated Universal Time (UTC).
VT_FILETIME 0x0040 | A 64-bit integer representing the number of 100-nanosecond intervals since 00:00:00 on January 1, 1601, UTC.
VT_CLSID 0x0048 | A 16-byte binary value containing a GUID.
VT_BLOB 0x0041 | A 4-byte unsigned integer count of bytes in the binary large object (BLOB) followed by that many bytes of data.
VT_BLOB_OBJECT 0x0046 | A 4-byte unsigned integer count of bytes in the BLOB followed by that many bytes of data.
VT_BSTR 0x0008 | A 4-byte unsigned integer count of bytes in the string followed by a string, as specified in the following section under vValue.
VT_LPSTR 0x001E | A null-terminated ANSI string.
VT_LPWSTR 0x001F | A null-terminated Unicode (as specified in [UNICODE]) string.
VT_VARIANT 0x000C | When used in a CTableColumn description, vValue is a CRowVariant structure. Otherwise, it is a CBaseStorageVariant structure. MUST be combined with a type modifier of VT_VECTOR.
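Two of the types in the preceding table carry scaled or epoch-relative integers. The following sketch shows one way to interpret them. It assumes that "scaled by 10,000" means the wire integer is the currency amount multiplied by 10,000; the function names are illustrative.

```python
from datetime import datetime, timedelta, timezone
from decimal import Decimal

# VT_FILETIME counts 100-nanosecond intervals from this instant.
FILETIME_EPOCH = datetime(1601, 1, 1, tzinfo=timezone.utc)


def filetime_to_datetime(filetime):
    """Convert a VT_FILETIME integer to a UTC datetime (microsecond precision)."""
    # Integer division drops any remainder below one microsecond.
    return FILETIME_EPOCH + timedelta(microseconds=filetime // 10)


def cy_to_decimal(raw):
    """Convert a VT_CY integer to a decimal amount (assumed: raw = amount * 10,000)."""
    return Decimal(raw) / Decimal(10000)
```

Decimal is used for VT_CY so that the four fractional digits survive without binary floating-point rounding.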
The following table specifies the type modifiers for vType. Type modifiers can be combined with vType using the bitwise OR operation to change the meaning of vValue to indicate it is one of the possible array types.
Value | Meaning
VT_VECTOR 0x1000 | If the type indicator is combined with VT_VECTOR by using an OR operator, vValue is a counted array of values of the indicated type. See section 2.2.1.1.1.1.
This type modifier MUST NOT be combined with the following types: VT_INT, VT_UINT, VT_BLOB, and VT_BLOB_OBJECT.
When the VT_VARIANT vType is used in a CBaseStorageVariant structure, it MUST be combined with a type modifier of VT_VECTOR. There is no such limitation when the VT_VARIANT vType is used in a CTableColumn structure, which specifies individual binding.
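The type modifier rules for a CBaseStorageVariant structure can be sketched as a validation step. The function names split_vtype and check_variant_vtype are hypothetical, and the check covers only the two rules stated above; it does not verify that the base type appears in the vType table.

```python
VT_VECTOR = 0x1000
VT_VARIANT = 0x000C
# Base types that MUST NOT be combined with VT_VECTOR:
# VT_INT, VT_UINT, VT_BLOB, VT_BLOB_OBJECT.
NOT_VECTORIZABLE = frozenset({0x0016, 0x0017, 0x0041, 0x0046})


def split_vtype(vtype):
    """Split a 16-bit vType into (base type, is_vector)."""
    return vtype & ~VT_VECTOR & 0xFFFF, bool(vtype & VT_VECTOR)


def check_variant_vtype(vtype):
    """Apply the VT_VECTOR rules for a CBaseStorageVariant vType."""
    base, is_vector = split_vtype(vtype)
    if is_vector and base in NOT_VECTORIZABLE:
        raise ValueError("type 0x%04X cannot be combined with VT_VECTOR" % base)
    if base == VT_VARIANT and not is_vector:
        raise ValueError("VT_VARIANT requires VT_VECTOR in CBaseStorageVariant")
    return base, is_vector
```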
vData1 (1 byte): The value of this field MUST be set to 0x00.
vData2 (1 byte): The value of this field MUST be set to 0x00.
vValue (variable): The value for the match operation. The syntax MUST be as indicated in the vType field. The following table summarizes sizes for the vValue field, dependent on the vType field for fixed-length data types. The size is in bytes.
vType | Size
VT_I1, VT_UI1 | 1
VT_I2, VT_UI2, VT_BOOL | 2
VT_I4, VT_UI4, VT_R4, VT_INT, VT_UINT, VT_ERROR | 4
VT_I8, VT_UI8, VT_R8, VT_CY, VT_DATE, VT_FILETIME | 8
VT_CLSID | 16
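A minimal sketch of marshaling a CBaseStorageVariant with a fixed-length type follows. The mapping FIXED_FORMATS and the function serialize_fixed_variant are illustrative names. The sketch assumes that offset is the position in the message at which the structure would start, writes zero bytes as padding (the padding value is arbitrary), treats VT_FILETIME as unsigned because its signedness is not stated, and omits the 16-byte VT_CLSID.

```python
import struct

# struct format per fixed-length vType; sizes match the preceding table.
FIXED_FORMATS = {
    0x0010: "<b", 0x0011: "<B",                 # VT_I1, VT_UI1
    0x0002: "<h", 0x0012: "<H", 0x000B: "<H",   # VT_I2, VT_UI2, VT_BOOL
    0x0003: "<i", 0x0013: "<I", 0x0004: "<f",   # VT_I4, VT_UI4, VT_R4
    0x0016: "<i", 0x0017: "<I", 0x000A: "<I",   # VT_INT, VT_UINT, VT_ERROR
    0x0014: "<q", 0x0015: "<Q", 0x0005: "<d",   # VT_I8, VT_UI8, VT_R8
    0x0006: "<q", 0x0007: "<d", 0x0040: "<Q",   # VT_CY, VT_DATE, VT_FILETIME
}


def serialize_fixed_variant(offset, vtype, value):
    """Return Padding1 + vType + vData1 + vData2 + vValue for a fixed-length type."""
    padding = b"\x00" * ((-offset) % 4)        # align vType to a multiple of 4
    header = struct.pack("<HBB", vtype, 0, 0)  # vData1 and vData2 MUST be 0x00
    return padding + header + struct.pack(FIXED_FORMATS[vtype], value)
```

For example, a VT_I4 value of 1 written at offset 0 produces the eight bytes 03 00 00 00 01 00 00 00.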
If vType is set to VT_BLOB or VT_BSTR, the structure of vValue is specified in the following diagram.
Packet diagram (32 bits per row; fields in wire order): cbSize, blobData (variable)
cbSize (4 bytes): A 32-bit unsigned integer. Indicates the size of the blobData field in bytes. If vType is set to VT_BSTR, cbSize MUST be set to 0x00000000 when the string represented is an empty string.
blobData (variable): MUST be of length cbSize in bytes.
For vType set to VT_BLOB, this field is opaque binary BLOB data.
For vType set to VT_BSTR, this field is a set of characters. The protocol client and protocol server MUST be configured to have interoperable character sets (which is not addressed in this protocol). There is no requirement that it be null-terminated.
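The cbSize and blobData pair is a length-prefixed byte sequence. A minimal sketch of encoding and decoding it follows; the function names are illustrative, and the caller is assumed to have already converted any VT_BSTR text to bytes in the agreed character set.

```python
import struct


def encode_counted_bytes(data):
    """Return cbSize (4 bytes, little-endian) followed by blobData."""
    return struct.pack("<I", len(data)) + data


def decode_counted_bytes(buf, offset=0):
    """Read cbSize and blobData at offset; return (blobData, next offset)."""
    (cb_size,) = struct.unpack_from("<I", buf, offset)
    start = offset + 4
    end = start + cb_size
    if end > len(buf):
        raise ValueError("cbSize exceeds the remaining buffer")
    return buf[start:end], end
```

An empty VT_BSTR is encoded as cbSize 0x00000000 with no blobData bytes.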
For a vType set to either VT_LPSTR or VT_LPWSTR, the structure of vValue is shown in the following diagram, with these caveats:
If vType is set to VT_LPSTR, cLen indicates the size of the string in ANSI characters, and string is a null-terminated ANSI string.
If vType is set to VT_LPWSTR, cLen indicates the size of the string in Unicode characters, and string is a null-terminated Unicode string.
Packet diagram (32 bits per row; fields in wire order): cLen, String (variable)
cLen (4 bytes): A 32-bit unsigned integer, indicating the size of the string field including the terminating null. A value of 0x00000000 indicates that no such string is present.
String (variable): Null-terminated string. This field MUST be absent if cLen equals 0x00000000.
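The cLen and String fields can be sketched as follows. Two assumptions are made that this section does not state: Unicode strings are encoded as UTF-16LE with cLen counted in 16-bit code units, and ANSI strings use a single-byte code page (cp1252 here, chosen only for illustration). The function names are hypothetical.

```python
import struct


def encode_lpwstr(text):
    """Encode VT_LPWSTR vValue; None means that no string is present."""
    if text is None:
        return struct.pack("<I", 0)  # cLen of 0: the String field is absent
    data = (text + "\x00").encode("utf-16-le")  # assumed encoding
    return struct.pack("<I", len(data) // 2) + data  # cLen includes the null


def encode_lpstr(text, codepage="cp1252"):
    """Encode VT_LPSTR vValue using an assumed single-byte code page."""
    if text is None:
        return struct.pack("<I", 0)
    data = (text + "\x00").encode(codepage)
    return struct.pack("<I", len(data)) + data
```

Note that an empty string still carries its terminating null, so cLen is 1 in that case and only an absent string has cLen equal to 0.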
2.2.1.1.1 CBaseStorageVariant Structures
The VT_VECTOR structure is used in the CBaseStorageVariant structure.
2.2.1.1.1.1 VT_VECTOR
The VT_VECTOR structure is used to pass one-dimensional arrays.
Packet diagram (32 bits per row; fields in wire order): vVectorElements, vVectorData (variable)
vVectorElements (4 bytes): Unsigned 32-bit integer, indicating the number of elements in the vVectorData field.
vVectorData (variable): An array of items that have a type indicated by vType with the 0x1000 bit cleared. The size of an individual fixed-length item can be obtained from the fixed-length data type table, as specified in section 2.2.1.1. The length of this field in bytes can be calculated by multiplying vVectorElements by the size of an individual item.
For variable-length data types, vVectorData contains a sequence of consecutively marshaled simple types in which the type is indicated by vType with the 0x1000 bit cleared.
The elements in the vVectorData field MUST be separated by 0 to 3 padding bytes such that each element begins at an offset that is a multiple of 4 bytes from the beginning of the message that contains this array. If padding bytes are present, the value they contain is arbitrary. The contents of the padding bytes MUST be ignored by the receiver.
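As an illustration, the following sketch decodes a vector whose element type is VT_I4 (vType 0x1003). It assumes that the structure starts at an offset that is a multiple of 4, so that every 4-byte element is already aligned and no padding bytes appear between elements. The function name decode_i4_vector is hypothetical.

```python
import struct


def decode_i4_vector(buf, offset=0):
    """Read vVectorElements and that many VT_I4 items; return (items, next offset)."""
    (count,) = struct.unpack_from("<I", buf, offset)
    # Each VT_I4 item is 4 bytes, so the data length is count * 4.
    items = struct.unpack_from("<%di" % count, buf, offset + 4)
    return list(items), offset + 4 + 4 * count
```

Variable-length element types would need the padding calculation applied before each element instead of a single bulk read.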
2.2.1.2 CFullPropSpec
The CFullPropSpec structure contains a property set GUID and a property identifier to uniquely identify a property. A CFullPropSpec instance has a property set GUID and either an integer property identifier or a string property name. For properties to match, the CFullPropSpec structure MUST match the column identifier in the full-text index catalog. There is no conversion between property identifiers and property names. Property names are case insensitive.
For more information, see the Indexing Service definition of FULLPROPSPEC in [MSDN-FULLPROPSPEC].
Packet diagram (32 bits per row; fields in wire order): paddingPropSet (variable), _guidPropSet (16 bytes), ulKind, PrSpec
paddingPropSet (variable): This field MUST be 0 to 7 bytes in length. The length of this field MUST be such that the following field begins at an offset that is a multiple of 8 bytes from the beginning of the message that contains this structure. If this field is present (that is, length nonzero), the value it contains is arbitrary. The content of this field MUST be ignored by the receiver.
_guidPropSet (16 bytes): The GUID of the property set to which the property belongs.
ulKind (4 bytes): A 32-bit unsigned integer. MUST be set to 0x00000001.
PrSpec (4 bytes): A 32-bit unsigned integer which contains the property identifier.
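A minimal sketch of marshaling a CFullPropSpec structure follows. It assumes that the GUID is written in the Windows little-endian field layout (uuid.UUID.bytes_le in Python), which this section does not state explicitly, and it writes zero bytes as padding. The function name serialize_full_prop_spec is illustrative.

```python
import struct
import uuid


def serialize_full_prop_spec(offset, prop_set, prop_id):
    """Return paddingPropSet + _guidPropSet + ulKind + PrSpec."""
    padding = b"\x00" * ((-offset) % 8)  # align _guidPropSet to a multiple of 8
    # ulKind MUST be 0x00000001; PrSpec carries the integer property identifier.
    return padding + prop_set.bytes_le + struct.pack("<II", 1, prop_id)
```

Without padding the structure occupies 24 bytes; at an offset of 4 it grows to 28 bytes because of the 8-byte alignment rule.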
2.2.1.3 CContentRestriction
The CContentRestriction structure contains a word or phrase to match in the search catalog for a specific property.
Packet diagram (32 bits per row; fields in wire order): _Property (variable), Padding1 (variable), Cc, _pwcsPhrase (variable), Padding2 (variable), Lcid, _ulGenerateMethod
_Property (variable): A CFullPropSpec structure. This field indicates the property on which to perform a match operation.
Padding1 (variable): This field MUST be 0 to 3 bytes in length. The length of this field MUST be such that the following field begins at an offset that is a multiple of 4 bytes from the beginning of the message that contains this structure. If this field is present (that is, length nonzero), the value it contains is arbitrary. The content of this field MUST be ignored by the receiver.
Cc (4 bytes): A 32-bit unsigned integer, specifying the number of characters in the _pwcsPhrase field.
_pwcsPhrase (variable): A Unicode string that is not null-terminated representing the word or phrase to match for the property. This field MUST NOT be empty. The Cc field contains the length of the string.
Padding2 (variable): This field MUST be 0 to 3 bytes in length. The length of this field MUST be such that the following field begins at an offset that is a multiple of 4 bytes from the beginning of the message that contains this structure. If this field is present (that is, length nonzero), the value it contains is arbitrary. The content of this field MUST be ignored by the receiver.
Lcid (4 bytes): A 32-bit unsigned integer, indicating the LCID of _pwcsPhrase, as specified in [MS-LCID].
_ulGenerateMethod (4 bytes): A 32-bit unsigned integer, specifying the method to use when generating alternate word forms. The following table specifies the possible values for this field along with their meanings.
Value | Meaning
GENERATE_METHOD_EXACT 0x00000000 | Exact match. Each word in the phrase MUST match exactly in the search catalog.
GENERATE_METHOD_PREFIX 0x00000001 | Prefix match. Each word in the phrase is considered a match if the word is a prefix of a crawled string. For example, if the word "barking" is crawled, then "bar" would match when performing a prefix match.
GENERATE_METHOD_INFLECT 0x00000002 | Matches inflectional forms of a word. An inflectional form of a word is a variant of the root word in the same part of speech that has been modified, according to linguistic rules of a given language. For example, inflectional forms of the verb swim in English include swim, swims, swimming, and swam.
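The fields that follow _Property in a CContentRestriction structure can be sketched as shown below. The sketch assumes that offset is the message offset immediately after the _Property field, that the phrase is encoded as UTF-16LE with Cc counted in 16-bit code units, and that padding bytes are zero. The function name serialize_phrase_fields is hypothetical.

```python
import struct

GENERATE_METHOD_EXACT = 0x00000000
GENERATE_METHOD_PREFIX = 0x00000001
GENERATE_METHOD_INFLECT = 0x00000002


def serialize_phrase_fields(offset, phrase, lcid, generate_method):
    """Return Padding1 + Cc + _pwcsPhrase + Padding2 + Lcid + _ulGenerateMethod."""
    if not phrase:
        raise ValueError("_pwcsPhrase MUST NOT be empty")
    out = bytearray(b"\x00" * ((-offset) % 4))           # Padding1
    encoded = phrase.encode("utf-16-le")                 # not null-terminated
    out += struct.pack("<I", len(encoded) // 2)          # Cc, in characters
    out += encoded                                       # _pwcsPhrase
    out += b"\x00" * ((-(offset + len(out))) % 4)        # Padding2
    out += struct.pack("<II", lcid, generate_method)     # Lcid, _ulGenerateMethod
    return bytes(out)
```

For the phrase "bar" with LCID 0x0409 and GENERATE_METHOD_PREFIX, the six phrase bytes are followed by two padding bytes so that Lcid starts on a 4-byte boundary.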
2.2.1.4 CNatLanguageRestriction
The CNatLanguageRestriction structure contains a natural language query match for a property.
Packet diagram (32 bits per row; fields in wire order): _Property (variable), _padding_cc (variable), Cc, _pwcsPhrase (variable), _padding_lcid (variable), Lcid
_Property (variable): A CFullPropSpec structure. This field indicates the property on which to perform the match operation.
_padding_cc (variable): This field MUST be 0 to 3 bytes in length. The length of this field MUST be such that the following field begins at an offset that is a multiple of 4 bytes from the beginning of the message that contains this structure. If this field is present (that is, length nonzero), the value it contains is arbitrary. The content of this field MUST be ignored by the receiver.
Cc (4 bytes): A 32-bit unsigned integer, specifying the number of characters in the _pwcsPhrase field.
_pwcsPhrase (variable): A Unicode string that is not null-terminated with the text to be searched for within the specific property. This string MUST NOT be empty. The Cc field contains the length of the string.
_padding_lcid (variable): This field MUST be 0 to 3 bytes in length. The length of this field MUST be such that the following field begins at an offset that is a multiple of 4 bytes from the beginning of the message that contains this structure. If this field is present (that is, length nonzero), the value it contains is arbitrary. The content of this field MUST be ignored by the receiver.
Lcid (4 bytes): A 32-bit unsigned integer indicating the LCID of _pwcsPhrase, as specified in [MS-LCID].
2.2.1.5 CNodeRestriction
The CNodeRestriction structure contains an array of command tree restriction (1) nodes for constraining the results of a search query.
Field layout (32-bit-wide diagram; fields in wire order):
_cNode
_paNode (variable)
_cNode (4 bytes): A 32-bit unsigned integer specifying the number of CRestriction structures contained in the _paNode field.
_paNode (variable): An array of CRestriction structures. Structures in the array MUST be separated by 0 to 3 padding bytes such that each structure begins at an offset that is a multiple of 4 bytes from the beginning of the message that contains this array. If padding bytes are present, the value they contain is arbitrary. The content of the padding bytes MUST be ignored by the receiver.
2.2.1.6 CPropertyRestriction
The CPropertyRestriction structure contains a property to get from each row, a comparison operator, and a constant. For each row, the value returned by the specific property in the row is compared against the constant to see if it has the relationship specified by the _relop field. For the comparison to be true, the data types of the values must match.
Field layout (32-bit-wide diagram; fields in wire order):
_relop
_Property (variable)
_prval (variable)
_relop (4 bytes): A 32-bit unsigned integer specifying the relation to perform on the property. _relop MUST be one of the following values with an optional bitwise-OR mask applied to the value.
Value | Meaning
PREQ (0x00000004) | An equality comparison.
PRNE (0x00000005) | A not-equal comparison.
The possible values for the optional mask are listed in the following table.
Value | Meaning
PRAny (0x00000200) | The restriction (1) is true if any element in the property value has the relationship with some element in the _prval field.
_Property (variable): A CFullPropSpec structure indicating the property on which to perform a match operation.
_prval (variable): A CBaseStorageVariant structure containing the value to relate to the property.
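As an illustration of how the _relop value combines a base relation with the optional mask, the following sketch separates the two parts. The function name split_relop is hypothetical; the constants are the values listed above.

```python
from typing import Tuple

PREQ = 0x00000004   # equality comparison
PRNE = 0x00000005   # not-equal comparison
PRAny = 0x00000200  # optional mask

_RELOP_NAMES = {PREQ: "PREQ", PRNE: "PRNE"}


def split_relop(relop: int) -> Tuple[str, bool]:
    """Return (base relation name, True if the PRAny mask is applied)."""
    base = relop & ~PRAny
    if base not in _RELOP_NAMES:
        raise ValueError("unsupported _relop value: 0x%08X" % relop)
    return _RELOP_NAMES[base], bool(relop & PRAny)
```

A value of 0x00000204 therefore decodes as PREQ with the PRAny mask applied.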
2.2.1.7 CSort
The CSort structure identifies a column, direction and LCID to sort by.
Field layout (32-bit-wide diagram; fields in wire order):
pidColumn
dwOrder
Locale
pidColumn (4 bytes): A 32-bit unsigned integer. This is the index in CPidMapper for the property to sort by.
dwOrder (4 bytes): A 32-bit unsigned integer. MUST be one of the following values, specifying how to sort based on the column.
Value | Meaning
QUERY_SORTASCEND (0x00000000) | The rows are to be sorted in ascending order based on the values in the column specified.
QUERY_SORTDESCEND (0x00000001) | The rows are to be sorted in descending order based on the values in the column specified.
Locale (4 bytes): A 32-bit unsigned integer indicating the LCID (as specified in [MS-LCID]) of the column. The LCID determines the sorting rules to use when sorting textual values.
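Because CSort consists of three fixed 4-byte fields, it can be sketched as a 12-byte record. The example below assumes little-endian byte order, which this section does not state; the class and method names are hypothetical.

```python
import struct
from typing import NamedTuple

QUERY_SORTASCEND = 0x00000000
QUERY_SORTDESCEND = 0x00000001


class CSort(NamedTuple):
    pid_column: int  # index in CPidMapper of the property to sort by
    order: int       # QUERY_SORTASCEND or QUERY_SORTDESCEND
    locale: int      # LCID used for textual sorting rules

    def pack(self) -> bytes:
        if self.order not in (QUERY_SORTASCEND, QUERY_SORTDESCEND):
            raise ValueError("dwOrder MUST be 0 or 1")
        return struct.pack("<III", self.pid_column, self.order, self.locale)

    @classmethod
    def unpack(cls, data: bytes, offset: int = 0) -> "CSort":
        return cls(*struct.unpack_from("<III", data, offset))
```

Since each CSort is 12 bytes long, consecutive elements of an array that starts on a 4-byte boundary stay aligned without padding.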
2.2.1.8 CVectorRestriction
The CVectorRestriction structure contains a weighted OR operation over restriction (1) nodes. Vector restrictions (1) represent queries using the full text vector space model of ranking (as specified in [SALTON]). In addition to the OR operation they also compute a rank depending on the ranking algorithm.
Field layout (32-bit-wide diagram; fields in wire order):
_pres (variable)
_padding (variable)
_ulRankMethod
_pres (variable): A CNodeRestriction command tree on which a ranked OR operation is to be performed.
_padding (variable): This field MUST be 0 to 3 bytes in length. The length of this field MUST be such that the following field begins at an offset that is a multiple of 4 bytes from the beginning of the message that contains this structure. If this field is present (that is, length nonzero), the value it contains is arbitrary. The content of this field MUST be ignored by the receiver.
_ulRankMethod (4 bytes): A 32-bit unsigned integer specifying a ranking algorithm that MUST be set to one of the following values.
Value | Meaning
VECTOR_RANK_MIN (0x00000000) | Use the minimum algorithm as specified in [SALTON].
VECTOR_RANK_MAX (0x00000001) | Use the maximum algorithm as specified in [SALTON].
VECTOR_RANK_INNER (0x00000002) | Use the inner product algorithm as specified in [SALTON].
VECTOR_RANK_DICE (0x00000003) | Use the Dice coefficient algorithm as specified in [SALTON].
VECTOR_RANK_JACCARD (0x00000004) | Use the Jaccard coefficient algorithm as specified in [SALTON].
2.2.1.9 CRestriction
The CRestriction structure contains a restriction (1) node in a query command tree.
Field layout (32-bit-wide diagram; fields in wire order):
_ulType
Weight
Restriction (variable)
_ulType (4 bytes): A 32-bit unsigned integer indicating the restriction (1) type used for the command tree node. The type determines what is found in the Restriction field of the structure, as described in the following table. This MUST be set to one of the following values.
Value | Meaning
RTNone (0x00000000) | The node represents a noise word in a vector query.
RTAnd (0x00000001) | The node contains a CNodeRestriction on which a logical AND operation is to be performed.
RTOr (0x00000002) | The node contains a CNodeRestriction on which a logical OR operation is to be performed.
RTNot (0x00000003) | The node contains a CRestriction on which a NOT operation is to be performed.
RTContent (0x00000004) | The node contains a CContentRestriction.
RTProperty (0x00000005) | The node contains a CPropertyRestriction.
RTProximity (0x00000006) | The node contains a CNodeRestriction with an array of CContentRestriction structures. Any other kind of restriction (1) is undefined. The restriction (1) requires the words or phrases found in the CContentRestriction structures to be within a query server defined range to be a match. The query server can also compute a rank based on how far apart the words or phrases are.
RTVector (0x00000007) | The node contains a CVectorRestriction.
RTNatLanguage (0x00000008) | The node contains a CNatLanguageRestriction.
RTPhrase (0xFFFFFFFD) | The node contains a CNodeRestriction on which a phrase match is to be performed.
Weight (4 bytes): A 32-bit unsigned integer representing the weight of the node. Weight indicates the node's importance relative to other nodes in the query command tree. Higher weight values are more important.
Restriction (variable): The restriction (1) type for the command tree node. The syntax MUST be as indicated by the _ulType field.
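The relationship between _ulType and the content of the Restriction field can be sketched as a lookup performed after reading the two fixed fields. The example assumes little-endian byte order; the names RestrictionType, PAYLOAD_STRUCTURE and read_restriction_header are hypothetical, and no payload structure is listed for RTNone because none is named in the table above.

```python
import struct
from enum import IntEnum


class RestrictionType(IntEnum):
    RTNone = 0x00000000
    RTAnd = 0x00000001
    RTOr = 0x00000002
    RTNot = 0x00000003
    RTContent = 0x00000004
    RTProperty = 0x00000005
    RTProximity = 0x00000006
    RTVector = 0x00000007
    RTNatLanguage = 0x00000008
    RTPhrase = 0xFFFFFFFD


# Structure expected in the Restriction field for each _ulType value.
PAYLOAD_STRUCTURE = {
    RestrictionType.RTNone: None,
    RestrictionType.RTAnd: "CNodeRestriction",
    RestrictionType.RTOr: "CNodeRestriction",
    RestrictionType.RTNot: "CRestriction",
    RestrictionType.RTContent: "CContentRestriction",
    RestrictionType.RTProperty: "CPropertyRestriction",
    RestrictionType.RTProximity: "CNodeRestriction",
    RestrictionType.RTVector: "CVectorRestriction",
    RestrictionType.RTNatLanguage: "CNatLanguageRestriction",
    RestrictionType.RTPhrase: "CNodeRestriction",
}


def read_restriction_header(data: bytes, offset: int = 0):
    """Read _ulType and Weight; return (type, weight, payload structure name)."""
    ul_type, weight = struct.unpack_from("<II", data, offset)
    rtype = RestrictionType(ul_type)  # raises ValueError for undefined values
    return rtype, weight, PAYLOAD_STRUCTURE[rtype]
```

A parser built this way would continue at offset + 8 with the reader for the named structure.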
2.2.1.10 CColumnSet
The CColumnSet structure specifies the column numbers to be returned. This structure is always used in reference to a specific CPidMapper structure.
Field layout (32-bit-wide diagram; fields in wire order):
Count
indexes (variable)
Count (4 bytes): A 32-bit unsigned integer specifying the number of elements in the indexes array.
indexes (variable): An array of 4-byte unsigned integers representing zero-based indexes into the aPropSpec array in the corresponding CPidMapper structure. The corresponding property values are returned as columns in the result set.
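A minimal sketch of writing and reading a CColumnSet follows. It assumes little-endian byte order; the function names are hypothetical.

```python
import struct
from typing import List, Sequence, Tuple


def pack_column_set(indexes: Sequence[int]) -> bytes:
    """Serialize Count followed by the indexes array."""
    return struct.pack("<I%dI" % len(indexes), len(indexes), *indexes)


def unpack_column_set(data: bytes, offset: int = 0) -> Tuple[List[int], int]:
    """Return (indexes, offset of the first byte after the structure)."""
    (count,) = struct.unpack_from("<I", data, offset)
    indexes = struct.unpack_from("<%dI" % count, data, offset + 4)
    return list(indexes), offset + 4 + 4 * count
```

Each returned value is a zero-based position in the aPropSpec array of the CPidMapper structure sent with the same query.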
2.2.1.11 CDbColId
The CDbColId structure contains a column identifier.
Field layout (32-bit-wide diagram; fields in wire order):
eKind
padding (variable)
GUID (16 bytes)
ulId
eKind (4 bytes): MUST be set to 0x00000001.
padding (variable): This field MUST be 0 to 7 bytes in length. The length of this field MUST be such that the following field begins at an offset that is a multiple of 8 bytes from the beginning of the message that contains this structure. If this field is present (that is, length nonzero), the value it contains is arbitrary. The content of this field MUST be ignored by the receiver.
GUID (16 bytes): The property GUID.
ulId (4 bytes): This field contains an unsigned 32-bit integer specifying the property identifier.
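The 8-byte alignment rule for the GUID field can be sketched as follows. The example assumes little-endian integers and the Windows mixed-endian GUID layout (exposed by Python as uuid.UUID.bytes_le); neither is stated in this section. The function name pack_db_col_id is hypothetical, and zero bytes are used for the arbitrary padding.

```python
import struct
import uuid


def pack_db_col_id(guid: uuid.UUID, ul_id: int, offset: int) -> bytes:
    """Serialize a CDbColId that starts at message offset `offset`."""
    out = struct.pack("<I", 0x00000001)           # eKind
    out += b"\x00" * (-(offset + len(out)) % 8)   # padding so GUID is 8-byte aligned
    out += guid.bytes_le                          # GUID (assumed Windows layout)
    out += struct.pack("<I", ul_id)               # ulId
    return out
```

When the structure starts on an 8-byte boundary, 4 padding bytes follow eKind; when it starts 4 bytes past such a boundary, no padding is needed.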
2.2.1.12 CDbProp
The CDbProp structure contains an OLE-DB DBPROP database property. These properties control how search queries are interpreted by the query server. For more information about OLE-DB, see [MSDN-OLEDBP-OI].
Field layout (32-bit-wide diagram; fields in wire order):
DBPROPID
DBPROPOPTIONS
DBPROPSTATUS
colid (variable)
_padding (variable)
vValue (variable)
DBPROPID (4 bytes): A 32-bit unsigned integer indicating the property identifier. This field uniquely identifies each property in a particular search query but has no other interpretation. In particular, it is not a ulId as found in the CDbColId structure.
DBPROPOPTIONS (4 bytes): Property options. This field MUST be set to 0x00000000.
DBPROPSTATUS (4 bytes): Property status. This field MUST be set to 0x00000000.
colid (variable): A CDbColId structure that defines the database property being passed.
_padding (variable): This field MUST be 0 to 3 bytes in length. The length of this field MUST be such that the following field begins at an offset that is a multiple of 4 bytes from the beginning of the message that contains this structure. If this field is present (that is, length nonzero), the value it contains is arbitrary. The content of this field MUST be ignored by the receiver.
vValue (variable): A CBaseStorageVariant containing the property value.
2.2.1.12.1 Database Properties
The MSSearch Query Protocol supports the following database properties to control the behavior of the query server. These properties are grouped into three property sets identified in the guidPropertySet field of the CDbPropSet structure.
The following table lists the properties that are part of the DBPROPSET_FSCIFRMWRK_EXT property set.
Value | Meaning
DBPROP_CI_CATALOG_NAME (0x00000002) | Specifies the name of the search catalog or search catalogs to query. Value MUST be a VT_LPWSTR or a VT_BSTR. The CDbColId structure MUST be set such that the eKind field contains 0x00000001 and the GUID and ulId fields are filled with zeros.
DBPROP_CI_QUERY_TYPE (0x00000007) | Specifies the type of query using a CDbColId structure. The structure MUST be set such that the eKind field contains 0x00000001 and the GUID and ulId fields are filled with zeros. When this property is specified, the vValue field MUST contain 0x00000000, indicating a regular search query.
2.2.1.13 CDbPropSet
The CDbPropSet structure contains a set of properties. The first field, guidPropertySet, is not padded and will begin where the previous structure in the message ended (as indicated by the "previous structure" entry in the following diagram). The 1-byte length of "previous structure" is arbitrary, and is not meant to suggest guidPropertySet will begin on any particular boundary. However, the cProperties field MUST be aligned to begin at a multiple of 4 bytes from the beginning of the message, and, hence, the format is depicted as follows.
Field layout (32-bit-wide diagram; fields in wire order):
guidPropertySet (16 bytes)
_padding (variable)
cProperties
aProps (variable)
guidPropertySet (16 bytes): A GUID identifying the property set. MUST be set to the binary form of the value DBPROPSET_FSCIFRMWRK_EXT (A9BD1526-6A80-11D0-8C9D-0020AF1D740E), identifying the property set of the properties contained in the aProps field.
_padding (variable): This field MUST be 0 to 3 bytes in length. The length of this field MUST be such that the following field begins at an offset that is a multiple of 4 bytes from the beginning of the message that contains this structure. If this field is present (that is, length nonzero), the value it contains is arbitrary. The content of this field MUST be ignored by the receiver.
cProperties (4 bytes): A 32-bit unsigned integer containing the number of elements in the aProps array.
aProps (variable): An array of CDbProp structures containing properties. Structures in the array MUST be separated by 0 to 3 padding bytes such that each structure begins at an offset that is a multiple of 4 bytes from the beginning of the message that contains this array. If padding bytes are present, the value they contain is arbitrary. The content of the padding bytes MUST be ignored by the receiver.
2.2.1.14 CPidMapper
The CPidMapper structure contains an array of property specifications and serves to map from a property offset to a CFullPropSpec. Because property offsets are more compact than full property specifications, they are used to reference properties in other parts of the protocol.
Field layout (32-bit-wide diagram; fields in wire order):
paddingCount (variable)
count
aPropSpec (variable)
paddingCount (variable): This field MUST be 0 to 3 bytes in length. The length of this field MUST be such that the byte offset from the beginning of the message to the count field is a multiple of 4 bytes. The value of the bytes can be any arbitrary value, and MUST be ignored by the receiver.
count (4 bytes): A 32-bit unsigned integer containing the number of elements in the aPropSpec array.
aPropSpec (variable): An array of CFullPropSpec structures.
2.2.1.15 CRowsetProperties
The CRowsetProperties structure contains configuration information for a search query.
Field layout (32-bit-wide diagram; fields in wire order):
_uBooleanOptions
_ulMaxOpenRows
_ulMemoryUsage
_cMaxResults
_cCmdTimeout
_dwQueryID (optional)
_uBooleanOptions (4 bytes): This field specifies various query Boolean options.
Bit layout (bit positions 0 through 31, in order): A, B, C, D, E, F, G, H, I, J, K, L, M, N, O, P, Q, R, S, T, U, V, W, X, Y, Z, AA, AB, AC, AD, AE, AF
A - U0: MUST be set to 1 and MUST be ignored.
B through O - U1, U2, A1, A2, U3, U4, U5, U6, U7, U8, U9, U10, U11, U12: MUST be set to 0 and MUST be ignored.
P - IN: Specifies the desired noise word behavior. Its value MUST be 1 if the noise words are ignored, and 0 otherwise.
Q through U - U13, U14, U15, U16, U17: MUST be set to 0 and MUST be ignored.
V - AT: Specifies the desired token inclusion behavior that MUST be used by the server unless the inclusion behavior is specified explicitly by keyword syntax as specified in [MS-SEARCH] section 2.2.11.8. If the value is 0, the server MUST return only the search results containing all of the tokens in the query. Otherwise, the server MUST return search results that contain any of the tokens.
W - ES: Specifies the desired stemming query expansion behavior. If the value is 1, the server MUST use stemming query expansion. If the value is 0, the server MUST NOT use stemming query expansion.
X through AF - EP, EN, IT, U18, U19, U20, U21, U22, U23: MUST be set to 0 and MUST be ignored.
_ulMaxOpenRows (4 bytes): A 32-bit unsigned integer. MUST be set to 0x00000000. Not used, and MUST be ignored.
_ulMemoryUsage (4 bytes): A 32-bit unsigned integer. MUST be set to 0x00000000. Not used, and MUST be ignored.
_cMaxResults (4 bytes): A 32-bit unsigned integer specifying the maximum number of rows that are to be returned for the query. If _cMaxResults is set to 0x00000000, the server assumes that all results are requested and behaves as if 0xFFFFFFFF was specified in _cMaxResults.
_cCmdTimeout (4 bytes): A 32-bit unsigned integer, specifying the number of seconds at which a query is to time out, counting from the time the query starts executing on the server. On a timeout, the query is interrupted and terminated, and the server continues to communicate with the client using the regular sequence of messages. A value of 0x00000000 means that the query is not to time out.
_dwQueryID (4 bytes): A 32-bit unsigned integer that identifies the query for debugging purposes. This field MUST only be present if both protocol client and protocol server are capable of handling _dwQueryID value as indicated by the _iClientVersion field in the CPMConnectIn message and the _serverVersion field in the CPMConnectOut message. The value of this field can be any arbitrary value. The protocol server SHOULD use this value in any logging related to the query being executed (if any).
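The following sketch assembles a CRowsetProperties structure from the three meaningful option bits. It rests on two assumptions that this section does not confirm: integers are little-endian, and diagram bit position n corresponds to the mask 1 << n (so A is the least significant bit). All names in the example are hypothetical.

```python
import struct
from typing import Optional

# Assumed masks: diagram position n maps to 1 << n (A=0, P=15, V=21, W=22).
U0_BIT = 1 << 0    # A - U0: MUST be set to 1
IN_BIT = 1 << 15   # P - IN: noise words are ignored
AT_BIT = 1 << 21   # V - AT: match any token rather than all tokens
ES_BIT = 1 << 22   # W - ES: use stemming query expansion


def pack_rowset_properties(ignore_noise: bool, any_token: bool, stemming: bool,
                           max_results: int = 0, timeout_s: int = 0,
                           query_id: Optional[int] = None) -> bytes:
    """Serialize CRowsetProperties; _dwQueryID is written only if provided."""
    options = U0_BIT
    if ignore_noise:
        options |= IN_BIT
    if any_token:
        options |= AT_BIT
    if stemming:
        options |= ES_BIT
    # _ulMaxOpenRows and _ulMemoryUsage MUST be 0x00000000.
    out = struct.pack("<5I", options, 0, 0, max_results, timeout_s)
    if query_id is not None:
        out += struct.pack("<I", query_id)
    return out
```

The caller is expected to pass query_id only when both the _iClientVersion and _serverVersion values indicate support for the _dwQueryID field.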
2.2.1.16 CRowVariant
The CRowVariant structure contains the fixed-size portion of a variable-length data type stored in the CPMGetRowsOut message.
Field layout (32-bit-wide diagram; fields in wire order):
vType
reserved1
reserved2
Offset (variable)
vType (2 bytes): A type indicator, indicating the type of vValue. It MUST be set to VT_I4 or VT_LPWSTR, as specified in section 2.2.1.1.
reserved1 (2 bytes): Not used. MUST be ignored on receipt.
reserved2 (4 bytes): Not used. MUST be ignored on receipt.
Offset (variable): An offset to variable-length data (for example, a string). This MUST be a 32-bit value (4-bytes long) if 32-bit offsets are being used (per the rules in section 2.2.3.7), or a 64-bit value (8-bytes long) if 64-bit offsets are being used.
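The effect of the offset width on the size of CRowVariant can be sketched as below. The example assumes little-endian byte order and uses the standard VARENUM values for VT_I4 and VT_LPWSTR; the function name unpack_row_variant is hypothetical. The Offset field is returned unmodified, without interpreting it for either type.

```python
import struct
from typing import Tuple

VT_I4 = 0x0003
VT_LPWSTR = 0x001F


def unpack_row_variant(data: bytes, offset: int = 0,
                       use_64bit: bool = False) -> Tuple[int, int]:
    """Return (vType, raw Offset field) for a CRowVariant at `offset`."""
    v_type, _reserved1, _reserved2 = struct.unpack_from("<HHI", data, offset)
    if v_type not in (VT_I4, VT_LPWSTR):
        raise ValueError("unexpected vType: 0x%04X" % v_type)
    fmt = "<Q" if use_64bit else "<I"   # 8-byte or 4-byte Offset field
    (raw_offset,) = struct.unpack_from(fmt, data, offset + 8)
    return v_type, raw_offset
```

The structure is therefore 12 bytes long with 32-bit offsets and 16 bytes long with 64-bit offsets.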
2.2.1.17 CSortSet
The CSortSet structure contains the sort order of the search query.
Field layout (32-bit-wide diagram; fields in wire order):
Count
sortArray (variable)
Count (4 bytes): A 32-bit unsigned integer specifying the number of elements in sortArray.
sortArray (variable): An array of CSort structures describing the order in which to sort the results of the search query. Structures in the array MUST be separated by 0 to 3 padding bytes such that each structure has a 4-byte alignment from the beginning of a message. Such padding bytes can be any arbitrary value, and MUST be ignored on receipt.
2.2.1.18 CTableColumn
The CTableColumn structure contains a column of a CPMSetBindingsIn message.
Field layout (32-bit-wide diagram; fields in wire order):
PropSpec (variable)
vType
ValueUsed
_padding1 (optional)
ValueOffset
ValueSize
StatusUsed
_padding2 (optional)
StatusOffset
LengthUsed
_padding3 (optional)
LengthOffset
PropSpec (variable): A CFullPropSpec structure.
vType (4 bytes): A 32-bit reserved field. MUST be set to 0x0000000C.
ValueUsed (1 byte): A 1-byte reserved field. MUST be set to 0x01.
_padding1 (1 byte): A 1-byte field.
Note This field MUST be inserted before ValueOffset if, without it, ValueOffset would not begin at an even offset from the beginning of the message. The value of this byte is arbitrary, and MUST be ignored.
ValueOffset (2 bytes): An unsigned 2-byte integer specifying the offset of the column value in the row.
ValueSize (2 bytes): An unsigned 2-byte integer specifying the size of the column value in bytes.
StatusUsed (1 byte): A 1-byte reserved field. MUST be set to 0x01.
_padding2 (1 byte): A 1-byte field. Note This field MUST be inserted before StatusOffset if, without it, the StatusOffset field would not begin at an even offset from the beginning of the message. The value of this byte is arbitrary, and MUST be ignored.
StatusOffset (2 bytes): An unsigned 2-byte integer. Specifies the offset of the column status in the row.
Status is represented as one byte in the response at the offset specified in the StatusOffset request field. The status byte MUST be equal to 0x00.
LengthUsed (1 byte): A reserved 1-byte field. MUST be set to 0x01.
_padding3 (1 byte): A 1-byte field.
Note This field MUST be inserted before LengthOffset if, without it, LengthOffset would not begin at an even offset from the beginning of the message. The value of this byte is arbitrary, and MUST be ignored.
LengthOffset (2 bytes): An unsigned 2-byte integer specifying the offset of the column length in the row. In CPMGetRowsOut, length is represented by a 32-bit unsigned integer at the offset specified in LengthOffset.
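The conditional 1-byte padding fields can be illustrated with a sketch that serializes everything after PropSpec. It assumes little-endian byte order and writes a zero byte for each arbitrary padding value; the function name pack_table_column_tail is hypothetical.

```python
import struct


def pack_table_column_tail(offset: int, value_offset: int, value_size: int,
                           status_offset: int, length_offset: int) -> bytes:
    """Serialize vType through LengthOffset.

    `offset` is the message offset at which the PropSpec field ended.
    """
    out = bytearray()

    def pad_even() -> None:
        # Insert one padding byte if the next field would start at an odd offset.
        if (offset + len(out)) % 2:
            out.append(0)

    out += struct.pack("<I", 0x0000000C)                  # vType
    out += b"\x01"                                        # ValueUsed
    pad_even()                                            # _padding1
    out += struct.pack("<HH", value_offset, value_size)   # ValueOffset, ValueSize
    out += b"\x01"                                        # StatusUsed
    pad_even()                                            # _padding2
    out += struct.pack("<H", status_offset)               # StatusOffset
    out += b"\x01"                                        # LengthUsed
    pad_even()                                            # _padding3
    out += struct.pack("<H", length_offset)               # LengthOffset
    return bytes(out)
```

If PropSpec ends at an even offset, all three padding bytes are present; if it ends at an odd offset, _padding1 is omitted because ValueOffset already falls on an even offset.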
2.2.1.19 QUERYMETADATA
The QUERYMETADATA structure contains a serialized representation of the metadata about a search query. This structure is returned in the vValue field of the CPMFetchValueOut message.
Field layout (32-bit-wide diagram; fields in wire order):
vType
Reserved0
vLen
NoiseWords (variable)
SpellingSuggestion (variable)
QueryTerms (variable)
TermIds (variable)
EstimatedCount
vType (2 bytes): A 16-bit reserved field describing the type of the property. MUST be set to VT_BLOB as specified in section 2.2.1.1.
Reserved0 (2 bytes): A reserved 16-bit field. MUST be set to 0x0000.
vLen (4 bytes): A 32-bit field. MUST be set to the length in bytes of the NoiseWords, SpellingSuggestion, QueryTerms, TermIds and EstimatedCount fields.
NoiseWords (variable): A CBaseStorageVariant containing terms which were treated as noise words during query execution. The vType field of this structure MUST be set to VT_VECTOR | VT_LPWSTR (0x101F). The vValue field MUST contain an array of 0 or more query terms which were treated as noise words by the query. See serialization for vValue in section 2.2.1.1.
SpellingSuggestion (variable): A CBaseStorageVariant containing terms which have been determined by the server as being alternate spellings of terms specified in the query. The vType field of this structure MUST be set to VT_LPWSTR (0x001F). The vValue field MUST contain space-delimited keywords, and any keywords which are spelling suggestions MUST be prefixed with "" and suffixed with "". If there are no spelling suggestions, the vValue MUST contain a null-terminated empty VT_LPWSTR. See serialization for vValue in section 2.2.1.1.
QueryTerms (variable): A CBaseStorageVariant containing terms from the query. The vType field of this structure MUST be set to VT_VECTOR | VT_LPWSTR (0x101F). The vValue field MUST contain an array of 0 or more query terms. See serialization for vValue in section 2.2.1.1.
TermIds (variable): A CBaseStorageVariant containing term identifiers from the search query. The vType field of the TermIds field MUST be set to VT_VECTOR | VT_UI4 (0x1013), and the vVectorElements field of the TermIds structure MUST be set to the same value as the vVectorElements field of the QueryTerms structure. The vVectorData field SHOULD contain term identifier values that are specific to the protocol server implementation. The protocol client MUST ignore the values in vVectorData. See serialization for vValue in section 2.2.1.1.
EstimatedCount (4 bytes): A 32-bit field containing the estimated number of total results, regardless of the number of rows requested by the protocol client.
2.2.2 Message Headers
All MSSearch Query Protocol messages have a 16-byte header.
The following diagram shows the MSSearch Query Protocol message header format.
Field layout (32-bit-wide diagram; fields in wire order):
_msg
_status
_ulChecksum
_ulReserved2
_msg (4 bytes): A 32-bit integer that identifies the type of message following the header.
The following table lists the MSSearch Query Protocol messages and the integer values specified for each message. As shown in the table, some values identify two messages in the table. In those instances, the message following the header can be identified by the direction of the message flow. If the direction is protocol client to protocol server, the message with "In" appended to the message name is indicated. If the direction is protocol server to protocol client, the message with "Out" appended to the message name is indicated.
Value | Meaning
0x000000C8 | CPMConnectIn or CPMConnectOut
0x000000C9 | CPMDisconnect
0x000000CA | CPMCreateQueryIn or CPMCreateQueryOut
0x000000CB | CPMFreeCursorIn or CPMFreeCursorOut
0x000000CC | CPMGetRowsIn or CPMGetRowsOut
0x000000D0 | CPMSetBindingsIn
0x000000E4 | CPMFetchValueIn or CPMFetchValueOut
_status (4 bytes): An HRESULT, indicating the status of the requested operation. When sent by the protocol client, this field MAY contain any value and the protocol server MUST ignore the value.
_ulChecksum (4 bytes): A 32-bit integer field. The _ulChecksum MUST be calculated as specified in section 3.2.4 for the following messages:
CPMConnectIn
CPMCreateQueryIn
CPMSetBindingsIn
CPMGetRowsIn
CPMFetchValueIn
Note For all other messages from the protocol client, _ulChecksum MUST be ignored by the receiver. A protocol client MUST ignore the _ulChecksum field.
_ulReserved2 (4 bytes): A 32-bit integer field. If the message type is CPMGetRowsIn and 64-bit offsets are used, as specified in section 2.2.3.7, this field MUST contain the upper 32 bits of the base value to use for pointer calculations in the row buffer (the _ulClientBase field of the CPMGetRowsIn message contains the lower 32 bits). Otherwise, the value MUST be set to 0x00000000.
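The header layout and the direction-dependent message names can be sketched as follows. The example assumes little-endian byte order, does not compute _ulChecksum (the algorithm is specified in section 3.2.4), and uses hypothetical function names. Messages listed with a single name in the table are mapped to that name for both directions.

```python
import struct
from typing import Tuple

# _msg value -> (name when sent by the client, name when sent by the server)
MESSAGES = {
    0x000000C8: ("CPMConnectIn", "CPMConnectOut"),
    0x000000C9: ("CPMDisconnect", "CPMDisconnect"),
    0x000000CA: ("CPMCreateQueryIn", "CPMCreateQueryOut"),
    0x000000CB: ("CPMFreeCursorIn", "CPMFreeCursorOut"),
    0x000000CC: ("CPMGetRowsIn", "CPMGetRowsOut"),
    0x000000D0: ("CPMSetBindingsIn", "CPMSetBindingsIn"),
    0x000000E4: ("CPMFetchValueIn", "CPMFetchValueOut"),
}


def pack_header(msg: int, status: int = 0, checksum: int = 0,
                reserved2: int = 0) -> bytes:
    """Serialize the 16-byte header; the checksum value is supplied by the caller."""
    return struct.pack("<4I", msg, status, checksum, reserved2)


def unpack_header(data: bytes) -> Tuple[int, int, int, int]:
    """Return (_msg, _status, _ulChecksum, _ulReserved2)."""
    return struct.unpack_from("<4I", data, 0)


def message_name(msg: int, from_client: bool) -> str:
    """Resolve a _msg value using the direction of the message flow."""
    in_name, out_name = MESSAGES[msg]
    return in_name if from_client else out_name


def split_client_base(base: int) -> Tuple[int, int]:
    """Split a 64-bit base into (_ulClientBase low part, _ulReserved2 high part)."""
    return base & 0xFFFFFFFF, (base >> 32) & 0xFFFFFFFF
```

For a CPMGetRowsIn message that uses 64-bit offsets, the second value returned by split_client_base would be placed in _ulReserved2.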
2.2.3 Messages
The following sections specify the MSSearch Query Protocol messages.
2.2.3.1 CPMConnectIn
The CPMConnectIn message begins a session between the protocol client and protocol server.
The format of the CPMConnectIn message that follows the header is shown in the following diagram.
Field layout (32-bit-wide diagram; fields in wire order):
_iClientVersion
_fClientIsRemote
_cbBlob1
_paddingcbdBlob2 (variable)
_cbBlob2
_padding (12 bytes)
MachineName (variable)
UserName (variable)
_paddingcPropSets (variable)
cPropSets
PropertySets (variable)
_paddingExtPropset (variable)
cExtPropSet
ExtPropertySets (variable)
_iClientVersion (4 bytes): A 32-bit integer indicating whether the protocol client is assumed capable of handling 64-bit offsets in CPMGetRowsOut messages and whether the protocol client is assumed capable of handling the _dwQueryID field in the CRowsetProperties structure. The value for this field MUST be set to one of the following values.
Value | Meaning
0x00000102 | The protocol client is not capable of handling 64-bit offsets in CPMGetRowsOut messages and is not capable of handling the _dwQueryID field in the CRowsetProperties structure.
0x00010102 | The protocol client is capable of handling 64-bit offsets in CPMGetRowsOut messages and is not capable of handling the _dwQueryID field in the CRowsetProperties structure.
0x00000103 | The protocol client is not capable of handling 64-bit offsets in CPMGetRowsOut messages and is capable of handling the _dwQueryID field in the CRowsetProperties structure.
0x00010103 | The protocol client is capable of handling 64-bit offsets in CPMGetRowsOut messages and is capable of handling the _dwQueryID field in the CRowsetProperties structure.
_fClientIsRemote (4 bytes): A Boolean value indicating whether the protocol client is running on a different computer than the protocol server.
_cbBlob1 (4 bytes): A 32-bit unsigned integer indicating the size in bytes of the cPropSets and PropertySets fields, combined.
_paddingcbdBlob2 (variable): This field MUST be 0 to 7 bytes in length. The length of this field MUST be such that the byte offset from the beginning of the message to the first structure contained in the _cbBlob2 field is a multiple of 8 bytes. The value of the bytes can be any arbitrary value, and MUST be ignored by the receiver.
_cbBlob2 (4 bytes): A 32-bit unsigned integer indicating the size in bytes of the cExtPropSet and ExtPropertySets fields, combined.
_padding (12 bytes): 12 bytes of padding. MUST be ignored.
MachineName (variable): The computer name of the protocol client. The name string MUST be a null-terminated array of fewer than 512 Unicode characters, including the NULL terminator.
UserName (variable): A string that represents the user name of the person who is running the application that invoked this protocol. The name string MUST be a null-terminated array of fewer than 512 Unicode characters when concatenated with MachineName.
_paddingcPropSets (variable): This field MUST be 0 to 7 bytes in length. The number of bytes MUST be the number required to make the byte offset of the cPropSets field a multiple of 8 from the beginning of the message that contains this structure. The value of the bytes can be any arbitrary value, and MUST be ignored by the receiver.
cPropSets (4 bytes): A 32-bit unsigned integer indicating the number of CDbPropSet structures following this field. This field MUST be set to 0x00000001.
PropertySets (variable): An array of CDbPropSet structures. The number of elements in this array MUST be equal to cPropSets.
_paddingExtPropset (variable): This field MUST be 0 to 7 bytes in length. The number of bytes MUST be the number required to make the byte offset of the cExtPropSet field a multiple of 8 from the beginning of the message that contains this structure. The value of the bytes can be any arbitrary value, and MUST be ignored by the receiver.
cExtPropSet (4 bytes): A 32-bit reserved field. MUST be set to 0x00000001.
ExtPropertySets (variable): An array of CDbPropSet structures. The number of elements in this array MUST be equal to cExtPropSet.
2.2.3.2 CPMConnectOut
The CPMConnectOut message contains a response to a CPMConnectIn message. The format of the CPMConnectOut message that follows the header is shown in the following diagram.
Field layout (32-bit-wide diagram; fields in wire order):
_serverVersion
_reserved (20 bytes)
_serverVersion (4 bytes): A 32-bit integer indicating whether the server can support 64-bit offsets, as specified in section 2.2.3.7, and whether the server can support the _dwQueryID field in the CRowsetProperties structure.
Value | Meaning
0x00000102 | The protocol server can only send 32-bit offsets and can only work with a CRowsetProperties structure without the _dwQueryID field.
0x00010102 | The protocol server can send 32-bit or 64-bit offsets and can only work with a CRowsetProperties structure without the _dwQueryID field.
0x00000103 | The protocol server can only send 32-bit offsets and can work with a CRowsetProperties structure with and without the _dwQueryID field.
0x00010103 | The protocol server can send 32-bit or 64-bit offsets and can work with a CRowsetProperties structure with and without the _dwQueryID field.
_reserved (20 bytes): A reserved field. The protocol server SHOULD set all bits of this field to 0. The protocol client MUST ignore this value.
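The capability values exchanged in _iClientVersion and _serverVersion can be sketched as a lookup table, from which the presence of the _dwQueryID field follows. The names VERSION_CAPABILITIES, decode_version and query_id_present are hypothetical.

```python
from typing import Tuple

# Version value -> (64-bit offsets supported, _dwQueryID supported)
VERSION_CAPABILITIES = {
    0x00000102: (False, False),
    0x00010102: (True, False),
    0x00000103: (False, True),
    0x00010103: (True, True),
}


def decode_version(version: int) -> Tuple[bool, bool]:
    """Return (supports_64bit_offsets, supports_query_id) for a version value."""
    try:
        return VERSION_CAPABILITIES[version]
    except KeyError:
        raise ValueError("unknown version value: 0x%08X" % version)


def query_id_present(client_version: int, server_version: int) -> bool:
    """_dwQueryID is present only if both sides can handle it."""
    return decode_version(client_version)[1] and decode_version(server_version)[1]
```

A client advertising 0x00010103 to a server that answers 0x00000102 would therefore omit _dwQueryID from CRowsetProperties.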
2.2.3.3 CPMCreateQueryIn
The CPMCreateQueryIn message creates a new search query. The format of the CPMCreateQueryIn message that follows the header is shown in the following diagram.
Field layout (32-bit-wide diagram; fields in wire order):
Size
A
Align0 (variable)
ColumnSet (variable)
B
Reserved2
Align1 (variable)
Restriction (variable)
C
Align2 (variable)
SortSet (variable)
Reserved0
Align3 (variable)
RowSetProperties (variable)
PidMapper (variable)
Reserved1
LCID
Size (4 bytes): A 32-bit unsigned integer indicating the number of bytes from the beginning of this field to the end of the message.
A - CColumnSetPresent (1 byte): A byte field indicating if the ColumnSet field is present. This field MUST be set to one of the following values.
Value | Meaning
0x00 | The ColumnSet field MUST be absent.
0x01 | The ColumnSet field MUST be present.
Align0 (variable): A field structure of 0, 1, 2 or 3 bytes that is used to align the next field to a 4-byte boundary. MUST be ignored by the protocol server.
ColumnSet (variable): A CColumnSet structure containing the property offsets for properties in CPidMapper that are returned as a column.
B - CRestrictionPresent (1 byte): A byte field indicating if the Restriction field is present.
Note If set to any nonzero value, the Restriction field MUST be present. If set to 0x00, Restriction MUST be absent.
Reserved2 (2 bytes): A 16-bit reserved field. MUST contain the value 0x0101.
Align1 (variable): A field structure of 0, 1, 2 or 3 bytes that is used to align the next field to a 4-byte boundary. MUST be ignored by the protocol server.
Restriction (variable): A CRestriction structure containing the command tree of the search query.
C - CSortSetPresent (1 byte): A byte field indicating if the SortSet field is present.
Note If set to any nonzero value, the SortSet field MUST be present. If set to 0x00, SortSet MUST be absent.
Align2 (variable): A field structure of 0, 1, 2 or 3 bytes that is used to align the next field to a 4-byte boundary. MUST be ignored by the protocol server.
SortSet (variable): A CSortSet structure indicating the sort order of the search query.
Reserved0 (1 byte): An 8-bit field reserved for future use. MUST be ignored by the protocol server.
Align3 (variable): A field structure of 0, 1, 2 or 3 bytes that is used to align the next field to a 4-byte boundary. MUST be ignored by the protocol server.
RowSetProperties (variable): A CRowsetProperties structure providing configuration information for the search query.
PidMapper (variable): A CPidMapper structure that maps from property offsets to full property descriptions.
Reserved1 (4 bytes): A 32-bit reserved field. MUST be set to 0x00000000.
LCID (4 bytes): A 32-bit unsigned integer, indicating the LCID of the search query, as specified in [MS-LCID].
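The Align0 through Align3 fields above all follow the same rule: insert 0 to 3 bytes so that the next field starts on a 4-byte boundary. A minimal sketch of that calculation follows. The helper names `align_padding` and `append_aligned` are hypothetical, and the sketch assumes that the buffer starts at the offset from which alignment is measured and that zero is an acceptable padding value (the fields are ignored on receipt).

```python
def align_padding(offset: int, boundary: int = 4) -> int:
    """Return how many padding bytes move `offset` to the next multiple of `boundary`."""
    return (-offset) % boundary


def append_aligned(buf: bytearray, field: bytes, boundary: int = 4) -> int:
    """Pad `buf` to `boundary`, append `field`, and return the offset where `field` starts.

    Assumption: len(buf) is the offset used for alignment, and padding bytes
    are written as zero because the receiver ignores them.
    """
    buf.extend(b"\x00" * align_padding(len(buf), boundary))
    start = len(buf)
    buf.extend(field)
    return start
```

For example, a 1-byte presence flag written at offset 0 leaves the buffer at offset 1, so three padding bytes are added before the structure that follows.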
2.2.3.4 CPMCreateQueryOut
The CPMCreateQueryOut message contains a response to a CPMCreateQueryIn message.
The format of the CPMCreateQueryOut message that follows the header is shown in the following diagram.
Bit layout (each row is 32 bits wide, bits 0 through 31):
_fTrueSequential (32 bits)
_fWorkIdUnique (32 bits)
Cursor (32 bits)
_fTrueSequential (4 bytes): A 32-bit unsigned integer that indicates whether the protocol server can use the search catalog in such a way that query results will likely be delivered faster. This field MUST be set to one of the values in the following table.
Value       Meaning
0x00000000  For the search query provided in CPMCreateQueryIn, query results will be delivered with higher latency.
0x00000001  For the search query provided in CPMCreateQueryIn, the protocol server can use the search catalog in such a way that query results will likely be delivered faster.
_fWorkIdUnique (4 bytes): A Boolean value indicating if the document identifiers pointed to by the cursors are unique throughout query results. MUST be set to one of the following values.
Value       Meaning
0x00000000  The document identifiers are unique only throughout the rowset.
0x00000001  The document identifiers are unique across multiple query results.
Cursor (4 bytes): A 32-bit unsigned integer representing the handle to the cursor that identifies the query being executed.
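As an illustration, the three fields above could be decoded as follows. This is a sketch rather than a reference implementation: it assumes the message header has already been stripped and that multibyte integers are little-endian, and the names `CreateQueryOut` and `parse_create_query_out` are hypothetical.

```python
import struct
from typing import NamedTuple


class CreateQueryOut(NamedTuple):
    true_sequential: bool
    work_id_unique: bool
    cursor: int


def parse_create_query_out(body: bytes) -> CreateQueryOut:
    """Decode the CPMCreateQueryOut fields that follow the message header.

    Assumption: fields are little-endian 32-bit unsigned integers.
    """
    if len(body) < 12:
        raise ValueError("CPMCreateQueryOut body needs at least 12 bytes")
    f_seq, f_unique, cursor = struct.unpack_from("<III", body, 0)
    # Both flags are restricted to 0x00000000 or 0x00000001.
    if f_seq not in (0, 1) or f_unique not in (0, 1):
        raise ValueError("flag field holds a value other than 0 or 1")
    return CreateQueryOut(bool(f_seq), bool(f_unique), cursor)
```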
2.2.3.5 CPMSetBindingsIn
The CPMSetBindingsIn message requests the binding of columns to a rowset. The protocol server will reply to the CPMSetBindingsIn request message using the header section of the CPMSetBindingsIn message with the results of the request contained in the _status field. The format of the CPMSetBindingsIn message that follows the header is shown in the following diagram.
Bit layout (each row is 32 bits wide, bits 0 through 31):
_hCursor (32 bits, optional)
_cbRow (32 bits, optional)
_cbBindingDesc (32 bits, optional)
_dummy (32 bits, optional)
cColumns (32 bits, optional)
aColumns (variable)
padding (variable)
_hCursor (4 bytes, optional): A 32-bit value representing the handle from the CPMCreateQueryOut message that identifies the search query for which to set bindings. This field MUST be present when the message is sent by the protocol client, and MUST be absent when the message is sent by the protocol server.
_cbRow (4 bytes, optional): A 32-bit unsigned integer indicating the size in bytes of a row. This field MUST be present when the message is sent by the protocol client, and MUST be absent when the message is sent by the protocol server.
_cbBindingDesc (4 bytes, optional): A 32-bit unsigned integer indicating the length in bytes of the fields following the _dummy field. This field MUST be present when the message is sent by the protocol client, and MUST be absent when the message is sent by the protocol server.
_dummy (4 bytes, optional): This field is unused, and MUST be ignored. It can be set to any arbitrary value. This field MUST be present when the message is sent by the protocol client, and MUST be absent when the message is sent by the protocol server.
cColumns (4 bytes, optional): A 32-bit unsigned integer indicating the number of elements in the aColumns array. This field MUST be present when the message is sent by the protocol client, and MUST be absent when the message is sent by the protocol server.
aColumns (variable): An array of the CTableColumn structures describing the columns of a row in the rowset. This field MUST be present when the message is sent by the protocol client, and MUST be absent when the message is sent by the protocol server. Structures in the array MUST be separated by 0 to 3 padding bytes such that each structure has a 4-byte alignment from the beginning of a message. Such padding bytes can be any arbitrary value, and MUST be ignored on receipt.
padding (variable): This field MUST be of the length necessary (0 to 3 bytes) to pad the message out to a multiple of 4 bytes in length. The value of the padding bytes can be any arbitrary value. This field MUST be ignored by the receiver. This field MUST be present when the message is sent by the protocol client, and MUST be absent when the message is sent by the protocol server.
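The alignment and length rules above can be sketched as follows for the client-to-server form of the message. The sketch treats each CTableColumn as an already serialized byte string, since that structure is defined elsewhere. It assumes little-endian integers, a 16-byte message header (passed as the `header_size` parameter), zero-valued padding, and that _cbBindingDesc counts cColumns, aColumns, and the trailing padding. The function name `build_set_bindings_body` is hypothetical.

```python
import struct
from typing import Iterable


def build_set_bindings_body(h_cursor: int, cb_row: int,
                            columns: Iterable[bytes],
                            header_size: int = 16, dummy: int = 0) -> bytes:
    """Build the CPMSetBindingsIn fields that follow the header (client side)."""
    cols = list(columns)
    # Message offset of cColumns: the header plus the four fixed 4-byte fields.
    tail_offset = header_size + 16
    tail = bytearray(struct.pack("<I", len(cols)))
    for col in cols:
        # Each structure starts 4-byte aligned relative to the message start.
        tail += b"\x00" * ((-(tail_offset + len(tail))) % 4)
        tail += col
    # Trailing padding brings the whole message to a multiple of 4 bytes.
    tail += b"\x00" * ((-(tail_offset + len(tail))) % 4)
    # Assumption: _cbBindingDesc covers every byte after _dummy.
    fixed = struct.pack("<4I", h_cursor, cb_row, len(tail), dummy)
    return fixed + bytes(tail)
```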
2.2.3.6 CPMGetRowsIn
The CPMGetRowsIn message requests rows from a search query. The format of the CPMGetRowsIn message that follows the header is shown in the following diagram.
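Bit layout (each row is 32 bits wide, bits 0 through 31):
_hCursor (32 bits)
_cRowsToTransfer (32 bits)
_cbRowWidth (32 bits)
_cbSeek (32 bits)
_cbReserved (32 bits)
_cbReadBuffer (32 bits)
_ulClientBase (32 bits)
Reserved4 (32 bits)
Reserved1 (32 bits)
Reserved2 (32 bits)
Reserved3 (32 bits)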
_hCursor (4 bytes): A 32-bit value representing the handle from the CPMCreateQueryOut message identifying the search query for which to retrieve rows.
_cRowsToTransfer (4 bytes): A 32-bit unsigned integer indicating the maximum number of rows the protocol client expects to receive in response to this message.
_cbRowWidth (4 bytes): A 32-bit unsigned integer indicating the length of a row in bytes.
_cbSeek (4 bytes): A 32-bit reserved field. MUST be set to 0x0000000C.
_cbReserved (4 bytes): A 32-bit unsigned integer indicating the offset, in bytes, of the Rows field in the CPMGetRowsOut response message. This offset begins from the first byte of the message header and MUST be set such that the Rows field follows the Reserved1 field.
_cbReadBuffer (4 bytes): A 32-bit unsigned integer. Note This field MUST be set to the maximum of the following two values rounded up to the nearest 512-byte multiple: the value of _cbRowWidth, or 1,000 times the value of _cRowsToTransfer. The value MUST NOT exceed 0x00004000.
_ulClientBase (4 bytes): A 32-bit unsigned integer indicating the base value to use for pointer calculations in the row buffer. If 64-bit offsets are being used, the _ulReserved2 field of the message header is used as the upper 32 bits, and _ulClientBase is used as the lower 32 bits of a 64-bit value. See section 2.2.3.7.
Reserved4 (4 bytes): A 32-bit reserved field. MUST be set to 0x00000000.
Reserved1 (4 bytes): A 32-bit reserved field. MUST be set to 0x00000001.
Reserved2 (4 bytes): A 32-bit reserved field. MUST be set to 0x00000000.
Reserved3 (4 bytes): A 32-bit reserved field. MUST be set to 0x00000000.
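The _cbReadBuffer rule and the fixed field values above can be sketched as follows. Two points are assumptions rather than statements of the specification: integers are packed little-endian, and a computed size above 0x00004000 is clamped to that limit rather than rejected. The function names are hypothetical, and the caller supplies _cbReserved because it depends on the header size.

```python
import struct

MAX_READ_BUFFER = 0x00004000  # upper bound stated for _cbReadBuffer


def read_buffer_size(cb_row_width: int, c_rows_to_transfer: int) -> int:
    """Larger of the row width and 1,000 bytes per row, rounded up to 512."""
    wanted = max(cb_row_width, 1000 * c_rows_to_transfer)
    rounded = ((wanted + 511) // 512) * 512
    # Assumption: clamp to the stated maximum instead of raising an error.
    return min(rounded, MAX_READ_BUFFER)


def build_get_rows_in_body(h_cursor: int, c_rows_to_transfer: int,
                           cb_row_width: int, cb_reserved: int,
                           ul_client_base: int = 0) -> bytes:
    """Pack the eleven 32-bit CPMGetRowsIn fields that follow the header."""
    return struct.pack(
        "<11I",
        h_cursor,
        c_rows_to_transfer,
        cb_row_width,
        0x0000000C,  # _cbSeek, fixed value
        cb_reserved,
        read_buffer_size(cb_row_width, c_rows_to_transfer),
        ul_client_base,
        0x00000000,  # Reserved4
        0x00000001,  # Reserved1
        0x00000000,  # Reserved2
        0x00000000,  # Reserved3
    )
```

For instance, a request for one row of 100 bytes yields max(100, 1000) = 1000, which rounds up to 1024.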
2.2.3.7 CPMGetRowsOut
The CPMGetRowsOut message replies to a CPMGetRowsIn message with the rows of a search query. Protocol servers MUST format offsets to variable-length data types in the row field as follows.
The protocol client indicated it was a 32-bit system (_iClientVersion set to 0x00000102 or 0x00000103 in CPMConnectIn): Offsets are 32-bit integers.
The protocol client indicated it was a 64-bit system (_iClientVersion set to 0x00010102 or 0x00010103 in CPMConnectIn), and the protocol server indicated that it was a 32-bit system (_serverVersion set to 0x00000102 or 0x00000103 in CPMConnectOut): Offsets are 32-bit integers.
The protocol client indicated it was a 64-bit system (_iClientVersion set to 0x00010102 or 0x00010103 in CPMConnectIn), and the protocol server indicated that it was a 64-bit system (_serverVersion set to 0x00010102 or 0x00010103 in CPMConnectOut): Offsets are 64-bit integers.
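The three cases above reduce to one rule: offsets are 64-bit only when both sides reported a 64-bit version. A minimal sketch, using the version constants listed above and a hypothetical function name:

```python
VERSIONS_32BIT = (0x00000102, 0x00000103)
VERSIONS_64BIT = (0x00010102, 0x00010103)


def offset_size_bytes(client_version: int, server_version: int) -> int:
    """Return 8 when both peers reported a 64-bit version, otherwise 4.

    client_version is _iClientVersion from CPMConnectIn and server_version
    is _serverVersion from CPMConnectOut.
    """
    known = VERSIONS_32BIT + VERSIONS_64BIT
    if client_version not in known or server_version not in known:
        raise ValueError("unrecognized version value")
    both_64 = client_version in VERSIONS_64BIT and server_version in VERSIONS_64BIT
    return 8 if both_64 else 4
```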
The format of the CPMGetRowsOut message that follows the header is depicted in the following diagram.
Bit layout (each row is 32 bits wide, bits 0 through 31):
_cRowsReturned (32 bits)
Reserved0 (32 bits)
Reserved2 (32 bits)
Reserved1 (32 bits)
paddingRows (variable)
Rows (variable)
_cRowsReturned (4 bytes): A 32-bit unsigned integer indicating the number of rows returned in the Rows field.
Reserved0 (4 bytes): A 32-bit reserved field. MUST be set to 0x00000000.
Reserved2 (4 bytes): A 32-bit reserved field. MUST be ignored by the receiver.
Reserved1 (4 bytes): A 32-bit reserved field. MUST be ignored by the receiver.
paddingRows (variable): This field MUST be of sufficient length (0 to _cbReserved-1 bytes) to pad the Rows field to _cbReserved offset from the beginning of a message where _cbReserved is the value in the CPMGetRowsIn message. Padding bytes used in this field can be any arbitrary value. This field MUST be ignored by the receiver.
Rows (variable): Row data is formatted as prescribed by column information in the most recent CPMSetBindingsIn message. Rows MUST be stored in forward order (for example, row 1 before row 2). Fixed-sized columns MUST be stored at the offsets specified by the most recent CPMSetBindingsIn message.
Columns MUST be stored as CRowVariants with vType set to VT_I4 or VT_LPWSTR. Because the total size of the Rows field is specified by the _cbReadBuffer field of the CPMGetRowsIn message, if row data does not fit exactly into the Rows field of the CPMGetRowsOut message then there will be unused padding within the Rows field.
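Because paddingRows places the Rows field at the _cbReserved offset measured from the start of the message, a receiver can locate the row data without inspecting the padding. The sketch below shows one way to do this. It assumes little-endian integers and a 16-byte message header (exposed as the `header_size` parameter), and the function name is hypothetical. Decoding the individual rows is out of scope because it depends on the bindings set earlier.

```python
import struct
from typing import Tuple


def split_get_rows_out(message: bytes, cb_reserved: int,
                       header_size: int = 16) -> Tuple[int, bytes]:
    """Return (_cRowsReturned, raw Rows bytes) from a full CPMGetRowsOut message.

    cb_reserved is the _cbReserved value sent in the matching CPMGetRowsIn.
    """
    # Four fixed 32-bit fields follow the header before paddingRows begins.
    fixed_end = header_size + 16
    if cb_reserved < fixed_end or cb_reserved > len(message):
        raise ValueError("_cbReserved does not point past the fixed fields")
    (c_rows_returned,) = struct.unpack_from("<I", message, header_size)
    # Everything between fixed_end and cb_reserved is paddingRows; skip it.
    return c_rows_returned, message[cb_reserved:]
```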
2.2.3.8 CPMFetchValueIn
The CPMFetchValueIn message requests metadata about the most recent search query initiated with a CPMCreateQueryIn message.
The format of the CPMFetchValueIn message that follows the header is shown in the following diagram.
Bit layout (each row is 32 bits wide, bits 0 through 31):
_wid (32 bits)
_cbSoFar (32 bits)
_cbPropSpec (32 bits)
_cbChunk (32 bits)
PropSpec (variable)
_padding (variable)
_wid (4 bytes): A 32-bit reserved field. MUST be set to 0xFFFFFFFF.
_cbSoFar (4 bytes): A 32-bit unsigned integer containing the number of bytes previously transferred for this property. Note This field MUST be set to 0x00000000 in the first message.
_cbPropSpec (4 bytes): A 32-bit unsigned integer containing the size of the PropSpec field in bytes. The value MUST NOT be 0x00000000 for the first message. It MUST be 0x00000000 for subsequent messages.
_cbChunk (4 bytes): A 32-bit unsigned integer containing the maximum number of bytes that the sender can accept in a CPMFetchValueOut message. The value of this field MUST be greater than 0x0000001C.
PropSpec (variable): A CFullPropSpec structure specifying the property to retrieve. If _cbPropSpec is not 0 then the following field values MUST be set on this structure, otherwise this structure MUST be omitted:
Field          Value
_guidPropSet   E83758B4-0C6E-435B-BCC6-268021EFAD6C
ulKind         PRSPEC_PROPID (0x00000001)
PrSpec         0x00000000
_padding (variable): This field MUST be of the length necessary (0 to 3 bytes) to pad the message out to a multiple of 4 bytes in length. The value of the padding bytes can be any arbitrary value. This field MUST be ignored by the receiver.
2.2.3.9 CPMFetchValueOut
The CPMFetchValueOut message replies to a CPMFetchValueIn message with metadata about the most recently issued query. As specified in section 3.1.5.4, this message is sent after each CPMFetchValueIn message until all bytes of the metadata are transferred. The message length, including the header, MUST be less than or equal to the value of _cbChunk specified in the CPMFetchValueIn message.
The format of the CPMFetchValueOut message that follows the header is shown in the following diagram.
Bit layout (each row is 32 bits wide, bits 0 through 31):
_cbValue (32 bits)
_fMoreExists (32 bits)
_fValueExists (32 bits)
vValue (variable)
_cbValue (4 bytes): A 32-bit unsigned integer containing the size in bytes of the vValue field.
_fMoreExists (4 bytes): A Boolean value indicating whether there are additional CPMFetchValueOut messages available. If this value is set to 0x00000001, the message size MUST be equal to the value of the _cbChunk field in CPMFetchValueIn.
Value       Meaning
0x00000000  There is no additional data available.
0x00000001  There is additional data available.
_fValueExists (4 bytes): A reserved 32-bit unsigned integer. MUST be set to 0x00000001.
vValue (variable): A portion of a byte array containing a QUERYMETADATA where the offset of the beginning of the portion is the value of _cbSoFar in CPMFetchValueIn.
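The chunked transfer described above can be sketched as a loop that keeps requesting data until _fMoreExists is 0, passing the number of bytes received so far as _cbSoFar. In this sketch, `send_request` is a hypothetical transport callback that sends a CPMFetchValueIn with the given _cbSoFar and returns the CPMFetchValueOut fields that follow the header. Little-endian integers are assumed.

```python
import struct
from typing import Callable, Tuple


def parse_fetch_value_out(body: bytes) -> Tuple[bytes, bool]:
    """Return (vValue bytes, more_exists) from the fields after the header."""
    if len(body) < 12:
        raise ValueError("CPMFetchValueOut body needs at least 12 bytes")
    cb_value, f_more, _f_value_exists = struct.unpack_from("<III", body, 0)
    value = body[12:12 + cb_value]
    if len(value) != cb_value:
        raise ValueError("vValue is shorter than _cbValue")
    return value, f_more == 1


def fetch_all(send_request: Callable[[int], bytes]) -> bytes:
    """Reassemble the full byte array from successive CPMFetchValueOut chunks."""
    collected = bytearray()
    while True:
        # len(collected) is the _cbSoFar value for the next request (0 at first).
        chunk, more = parse_fetch_value_out(send_request(len(collected)))
        collected += chunk
        if not more:
            return bytes(collected)
        if not chunk:
            # Guard against a peer that reports more data but sends none.
            raise ValueError("empty chunk received while _fMoreExists is set")
```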
2.2.3.10 CPMFreeCursorIn
The CPMFreeCursorIn message requests the release of a cursor. The format of the CPMFreeCursorIn message that follows the header is shown in the following diagram.
0
1
2
3
4
5
6
7
8
9
10
1
2
3
4
5
6
7
8
9
20
1
2
3
4
5
6
7
8
9
30