Tokenizer_sample 联合开发网

Pudn.com > 下载中心 > 网络编程 > Tokenizer_sample

Tokenizer_sample

所属分类：网络编程
开发工具：C#
文件大小：4KB
下载次数：0
上传日期：2020-08-02 18:27:09
上传者：凱莉

说明：用于认知搜索的微软云平台示例技能代码编程
(Tokenizer sample skill for cognitive search)

文件列表:

azuredeploy.json (4293, 2020-08-01)
host.json (26, 2020-08-01)
local.settings.json (137, 2020-08-01)
Tokenizer.cs (4206, 2020-08-01)
Tokenizer.csproj (880, 2020-08-01)

--- page_type: sample languages: - csharp products: - azure - azure-search name: Tokenizer sample skill for cognitive search description: This custom skill extracts normalized non-stop words from a text using the ML.NET library. azureDeploy: https://raw.githubusercontent.com/Azure-Samples/azure-search-power-skills/master/Text/Tokenizer/azuredeploy.json --- # Tokenizer This custom skill extracts normalized non-stop words from a text using [the ML.NET library](https://docs.microsoft.com/en-us/dotnet/api/microsoft.ml?view=ml-dotnet). The language used for stop word removal can be optionally specified with the `languageCode` parameter using the ISO 639-1 code. Supported languages are: * Arabic(ar) * Czech (cs) * Danish (da) * Dutch (nl) * English (en), is the default language used if none is specified. * French (fr) * German (de) * Italian (it) * Japanese (ja) * Norwegian Bokml (nb) * Polish (pl) * Portuguese (pt) * Spanish (es) * Swedish (sv) * Russian (ru) ## Requirements This skills have no additional requirements than the ones described in [the root `README.md` file](../../README.md). ## Deployment [![Deploy to Azure](https://azuredeploy.net/deploybutton.svg)](https://portal.azure.com/#create/Microsoft.Template/uri/https%3A%2F%2Fraw.githubusercontent.com%2FAzure-Samples%2Fazure-search-power-skills%2Fmaster%2FText%2FTokenizer%2Fazuredeploy.json) ## tokenizer ### Sample Input: ```json { "values": [ { "recordId": "record1", "data": { "text": "ML.NET's RemoveDefaultStopWords API removes stop words from tHe text/string. It requires the text/string to be tokenized beforehand.", "languageCode": "en" } } ] } ``` ### Sample Output: ```json { "values": [ { "recordId": "record1", "data": { "words": [ "mlnets", "removedefaultstopwords", "api", "removes", "stop", "words", "textstring", "requires", "textstring", "tokenized" ] }, "errors": [], "warnings": [] } ] } ``` ## Sample Skillset Integration In order to use this skill in a cognitive search pipeline, you'll need to add a skill definition to your skillset. Here's a sample skill definition for this example (inputs and outputs should be updated to reflect your particular scenario and skillset environment): ```json { "@odata.type": "#Microsoft.Skills.Custom.WebApiSkill", "description": "Tokenizer", "uri": "[AzureFunctionEndpointUrl]/api/tokenizer?code=[AzureFunctionDefaultHostKey]", "batchSize": 1, "context": "/document/content", "inputs": [ { "name": "text", "source": "/document/content" }, { "name": "languageCode", "source": "document/language" } ], "outputs": [ { "name": "words", "targetName": "words" } ] } ```

近期下载者：

相关文件：

评论：[我要评论] [举报此文件]

收藏者：