playscrapingnodejs: A small thing I did in July 2014. This is just a simple start, my first time using JavaSc

  • R2_570219
    Uploader
  • 1.4MB
    File size
  • zip
    File format
  • 0
    Favorites
  • VIP only
    Resource type
  • 0
    Downloads
  • 2022-06-04 05:31
    Upload date
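playscrapingnodejs: This is a simple Node.js script I wrote in July 2014. It scraped statistics for three specific athletes from espn.go.com. It was my first time coding in JavaScript and my first time using Node.js. I may come back to this at some point and revise parts of it, but I am leaving it here as an example of scraping with JavaScript and Node.js.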
playscrapingnodejs-master.zip
Content overview
# Async.js

[![Build Status via Travis CI](https://travis-ci.org/caolan/async.svg?branch=master)](https://travis-ci.org/caolan/async)

Async is a utility module which provides straight-forward, powerful functions
for working with asynchronous JavaScript. Although originally designed for
use with [Node.js](http://nodejs.org), it can also be used directly in the
browser. Also supports [component](https://github.com/component/component).

Async provides around 20 functions that include the usual 'functional'
suspects (`map`, `reduce`, `filter`, `each`…) as well as some common patterns
for asynchronous control flow (`parallel`, `series`, `waterfall`…). All these
functions assume you follow the Node.js convention of providing a single
callback as the last argument of your `async` function.

## Quick Examples

```javascript
async.map(['file1','file2','file3'], fs.stat, function(err, results){
    // results is now an array of stats for each file
});

async.filter(['file1','file2','file3'], fs.exists, function(results){
    // results now equals an array of the existing files
});

async.parallel([
    function(){ ... },
    function(){ ... }
], callback);

async.series([
    function(){ ... },
    function(){ ... }
]);
```

There are many more functions available so take a look at the docs below for a
full list. This module aims to be comprehensive, so if you feel anything is
missing please create a GitHub issue for it.

## Common Pitfalls

### Binding a context to an iterator

This section is really about `bind`, not about `async`.
If you are wondering how to make `async` execute your iterators in a given
context, or are confused as to why a method of another library isn't working
as an iterator, study this example:

```js
// Here is a simple object with an (unnecessarily roundabout) squaring method
var AsyncSquaringLibrary = {
  squareExponent: 2,
  square: function(number, callback){
    var result = Math.pow(number, this.squareExponent);
    setTimeout(function(){
      callback(null, result);
    }, 200);
  }
};

async.map([1, 2, 3], AsyncSquaringLibrary.square, function(err, result){
  // result is [NaN, NaN, NaN]
  // This fails because the `this.squareExponent` expression in the square
  // function is not evaluated in the context of AsyncSquaringLibrary, and is
  // therefore undefined.
});

async.map([1, 2, 3], AsyncSquaringLibrary.square.bind(AsyncSquaringLibrary), function(err, result){
  // result is [1, 4, 9]
  // With the help of bind we can attach a context to the iterator before
  // passing it to async. Now the square function will be executed in its
  // 'home' AsyncSquaringLibrary context and the value of `this.squareExponent`
  // will be as expected.
});
```

## Download

The source is available for download from
[GitHub](http://github.com/caolan/async).
Alternatively, you can install using Node Package Manager (`npm`):

    npm install async

__Development:__ [async.js](https://github.com/caolan/async/raw/master/lib/async.js) - 29.6kb Uncompressed

## In the Browser

So far it's been tested in IE6, IE7, IE8, FF3.6 and Chrome 5.
Usage:

```html
<script type="text/javascript" src="async.js"></script>
<script type="text/javascript">

    async.map(data, asyncProcess, function(err, results){
        alert(results);
    });

</script>
```

## Documentation

### Collections

* [`each`](#each)
* [`eachSeries`](#eachSeries)
* [`eachLimit`](#eachLimit)
* [`map`](#map)
* [`mapSeries`](#mapSeries)
* [`mapLimit`](#mapLimit)
* [`filter`](#filter)
* [`filterSeries`](#filterSeries)
* [`reject`](#reject)
* [`rejectSeries`](#rejectSeries)
* [`reduce`](#reduce)
* [`reduceRight`](#reduceRight)
* [`detect`](#detect)
* [`detectSeries`](#detectSeries)
* [`sortBy`](#sortBy)
* [`some`](#some)
* [`every`](#every)
* [`concat`](#concat)
* [`concatSeries`](#concatSeries)

### Control Flow

* [`series`](#seriestasks-callback)
* [`parallel`](#parallel)
* [`parallelLimit`](#parallellimittasks-limit-callback)
* [`whilst`](#whilst)
* [`doWhilst`](#doWhilst)
* [`until`](#until)
* [`doUntil`](#doUntil)
* [`forever`](#forever)
* [`waterfall`](#waterfall)
* [`compose`](#compose)
* [`seq`](#seq)
* [`applyEach`](#applyEach)
* [`applyEachSeries`](#applyEachSeries)
* [`queue`](#queue)
* [`priorityQueue`](#priorityQueue)
* [`cargo`](#cargo)
* [`auto`](#auto)
* [`retry`](#retry)
* [`iterator`](#iterator)
* [`apply`](#apply)
* [`nextTick`](#nextTick)
* [`times`](#times)
* [`timesSeries`](#timesSeries)

### Utils

* [`memoize`](#memoize)
* [`unmemoize`](#unmemoize)
* [`log`](#log)
* [`dir`](#dir)
* [`noConflict`](#noConflict)

## Collections

<a name="forEach" />
<a name="each" />
### each(arr, iterator, callback)

Applies the function `iterator` to each item in `arr`, in parallel.
The `iterator` is called with an item from the list, and a callback for when it
has finished. If the `iterator` passes an error to its `callback`, the main
`callback` (for the `each` function) is immediately called with the error.
Note that since this function applies `iterator` to each item in parallel,
there is no guarantee that the iterator functions will complete in order.

__Arguments__

* `arr` - An array to iterate over.
* `iterator(item, callback)` - A function to apply to each item in `arr`.
  The iterator is passed a `callback(err)` which must be called once it has
  completed. If no error has occurred, the `callback` should be run without
  arguments or with an explicit `null` argument.
* `callback(err)` - A callback which is called when all `iterator` functions
  have finished, or an error occurs.

__Examples__

```js
// assuming openFiles is an array of file names and saveFile is a function
// to save the modified contents of that file:

async.each(openFiles, saveFile, function(err){
    // if any of the saves produced an error, err would equal that error
});
```

```js
// assuming openFiles is an array of file names

async.each(openFiles, function( file, callback) {

  // Perform operation on file here.
  console.log('Processing file ' + file);

  if( file.length > 32 ) {
    console.log('This file name is too long');
    callback('File name too long');
  } else {
    // Do work to process file here
    console.log('File processed');
    callback();
  }
}, function(err){
    // if any of the file processing produced an error, err would equal that error
    if( err ) {
      // One of the iterations produced an error.
      // All processing will now stop.
      console.log('A file failed to process');
    } else {
      console.log('All files have been processed successfully');
    }
});
```

---------------------------------------

<a name="forEachSeries" />
<a name="eachSeries" />
### eachSeries(arr, iterator, callback)

The same as [`each`](#each), only `iterator` is applied to each item in `arr` in
series. The next `iterator` is only called once the current one has completed.
This means the `iterator` functions will complete in order.
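To make the ordering guarantee of `eachSeries` concrete, here is a minimal sketch. It deliberately does not load the library: `eachSeriesSketch` is a hypothetical stand-in written for illustration, assuming only the behavior described above (one `iterator` at a time, stop at the first error). The real implementation in async.js may well differ.

```javascript
// Hypothetical stand-in for async.eachSeries, written only to illustrate the
// documented behavior. It is NOT the library's implementation.
function eachSeriesSketch(arr, iterator, callback) {
  callback = callback || function () {};
  var index = 0;

  function next(err) {
    if (err) {
      return callback(err);   // stop at the first error
    }
    if (index >= arr.length) {
      return callback(null);  // every item has been processed
    }
    var item = arr[index];
    index += 1;
    iterator(item, next);     // the next item starts only after this one calls back
  }

  next();
}

// The slowest item comes first, yet items still finish in array order,
// because each timer is started only after the previous one has fired.
var finished = [];
eachSeriesSketch([30, 10, 20], function (delay, done) {
  setTimeout(function () {
    finished.push(delay);
    done();
  }, delay);
}, function (err) {
  console.log(err ? 'failed: ' + err : 'finished in order: ' + finished.join(', '));
});
```

Running the sketch prints `finished in order: 30, 10, 20`. With a parallel `each`, all three timers would start together, so the 10 ms item would typically finish first.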
---------------------------------------

<a name="forEachLimit" />
<a name="eachLimit" />
### eachLimit(arr, limit, iterator, callback)

The same as [`each`](#each), only no more than `limit` `iterator`s will be
simultaneously running at any time.

Note that the items in `arr` are not processed in batches, so there is no
guarantee that the first `limit` `iterator` functions will complete before any
others are started.

__Arguments__

* `arr` - An array to iterate over.
* `limit` - The maximum number of `iterator`s to run at any time.
* `iterator(item, callback)` - A function to apply to each item in `arr`.
  The iterator is passed a `callback(err)` which must be called once it has
  completed. If no error has occurred, the callback should be run without
  arguments or with an explicit `null` argument.
* `callback(err)` - A callback which is called when all `iterator` functions
  have finished, or an error occurs.

__Example__

```js
// Assume documents is an array of JSON objects and requestApi is a
// function that interacts
```