# Simple Bigtable

## Overview

Cloud Bigtable is a datastore supported by Google for storing huge amounts of data while maintaining very low read latency. The main drawback of using Bigtable is that Google does not currently have an official asynchronous client. Within Spotify we have been using the RPC client, which is a pain to use. This library aims to fix that by making the most common interactions with Bigtable clean and easy to use, while not preventing you from doing anything you could do with the RPC client.

To import with Maven, add this to your pom:

```xml
<dependency>
  <groupId>com.spotify</groupId>
  <artifactId>simple-bigtable</artifactId>
  <version>LATEST_RELEASE</version>
</dependency>
```

## Raw RPC Client vs Bigtable Client Comparison

### Using The RPC Client

To give an example of using the base RPC client (which gives the `BigtableSession` object), this is how you would request a single cell from Bigtable.

```java
String projectId;
String zone;
String cluster;
BigtableSession session;

String fullTableName = String.format("projects/%s/zones/%s/clusters/%s/tables/%s",
    projectId, zone, cluster, "table");

// Could also use a filter chain, but you can't actually set all the filters within the same RowFilter object
// without a merge or chain of some sort
final RowFilter.Builder filter = RowFilter.newBuilder().setFamilyNameRegexFilter("column-family");
filter.mergeFrom(RowFilter.newBuilder().setColumnQualifierRegexFilter(ByteString.copyFromUtf8("column-1")).build());
filter.mergeFrom(RowFilter.newBuilder().setCellsPerColumnLimitFilter(1).build()); // By default it is 1

final ReadRowsRequest readRowsRequest = ReadRowsRequest.newBuilder()
    .setTableName(fullTableName)
    .setRowKey(ByteString.copyFromUtf8("row"))
    .setNumRowsLimit(1)
    .setFilter(filter)
    .build();
final ListenableFuture<List<Row>> future = session.getDataClient().readRowsAsync(readRowsRequest);
final ListenableFuture<Cell> cell = FuturesExtra.syncTransform(future, rows -> {
  // This doesn't actually check if the row, column family, and qualifier exist
  // IndexOutOfBoundsException might be thrown
  return rows.get(0).getFamilies(0).getColumns(0).getCells(0);
});
```

### Bigtable Client

The goal of this client is to let you query what you want with minimal overhead (there should be no need to create all these filter objects), and to give you the object you want without needing to constantly convert a list of rows down to a single cell.

Note that these examples use a String as a row key. Bigtable keys are really byte arrays; Strings in this API are just a convenience. Under the covers, the string "row" is converted to a ByteString. In practice you should use byte arrays as keys, as that will be more efficient.

Here is the same query as above using this client wrapper.

```java
String projectId;
String zone;
String cluster;
BigtableSession session;

Bigtable bigtable = new Bigtable(session, projectId, zone, cluster);
final ListenableFuture<Optional<Cell>> cell = bigtable.read("table")
    .row("row")
    .column("family:qualifier") // specify both column family and column qualifier separated by colon
    .latestCell()
    .executeAsync();
```

## Performing Reads

The goal of this client is to make the most tedious and common interactions with Bigtable as painless as possible. Reading data is therefore a major focus. Here are some examples of reading data.
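The read examples in this section pass row keys as Strings for brevity. As noted earlier, Bigtable keys are really byte arrays, and byte keys can be more compact. The sketch below uses only the JDK to compare the two encodings of a numeric id. The class `RowKeys`, its methods, and the sample id are hypothetical names for illustration and are not part of this library. UTF-8 is assumed for the String form, matching the `ByteString.copyFromUtf8` calls above.

```java
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;

// Hypothetical helper for illustration only; not part of simple-bigtable.
final class RowKeys {

  // Encode a text key as UTF-8 bytes (assumed encoding for String keys).
  static byte[] fromString(String key) {
    return key.getBytes(StandardCharsets.UTF_8);
  }

  // Pack a numeric id into 8 big-endian bytes.
  static byte[] fromLong(long id) {
    return ByteBuffer.allocate(Long.BYTES).putLong(id).array();
  }

  public static void main(String[] args) {
    long userId = 1234567890123456789L; // hypothetical id
    byte[] asText = fromString(Long.toString(userId));
    byte[] packed = fromLong(userId);
    System.out.println("text key bytes:   " + asText.length);
    System.out.println("packed key bytes: " + packed.length);
  }
}
```

For non-negative ids, the big-endian layout also keeps keys in numeric order when they are compared byte by byte, which matters if you plan to scan row ranges.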
Get full column family within row

```java
final ListenableFuture<Optional<Family>> family = bigtable.read("table")
    .row("row")
    .family("family")
    .executeAsync();
```

Get multiple columns within a row (currently they all need to be in the same column family, but hopefully that gets fixed)

```java
// Get the entire column
final ListenableFuture<List<Column>> columns = bigtable.read("table")
    .row("row")
    .family("family")
    .columnQualifiers(Lists.newArrayList("qualifier-1", "qualifier-2"))
    .executeAsync();

// Get the latest cell in each column
final ListenableFuture<List<Column>> latestCells = bigtable.read("table")
    .row("row")
    .family("family")
    .columnQualifiers(Lists.newArrayList("qualifier1", "qualifier2"))
    .latestCell()
    .executeAsync();
```

Get columns within a single family and within a column qualifier range

```java
final ListenableFuture<List<Column>> columns = bigtable.read("table")
    .row("row")
    .family("family")
    .columns()
    .startQualifierInclusive(startBytestring)
    .endQualifierExclusive(endBytestring)
    .executeAsync();
```

Get cells between certain timestamps within a column

```java
final ListenableFuture<List<Cell>> cells = bigtable.read("table")
    .row("row")
    .column("family:qualifier")
    .cells()
    .startTimestampMicros(someTimestamp)
    .endTimestampMicros(someLatertimestamp)
    .executeAsync();
```

Get the latest cell of a certain value within a column

```java
final ListenableFuture<Optional<Cell>> cell = bigtable.read("table")
    .row("row")
    .column("family:qualifier")
    .cells()
    .startValueInclusive(myValueByteString)
    .endValueInclusive(myValueByteString)
    .latest()
    .executeAsync();
```

Get the latest cell between two timestamps within a column for multiple rows

```java
final ListenableFuture<List<Row>> rows = bigtable.read("table")
    .rows(ImmutableSet.of("row1", "row2"))
    .column("family:qualifier")
    .cells()
    .startTimestampMicros(someTimestamp)
    .endTimestampMicros(someLatertimestamp)
    .latest()
    .executeAsync();
```

Get multiple column families and column qualifiers (will match all combinations)

```java
final ListenableFuture<List<Row>> rows = bigtable.read("table")
    .row("row")
    .families(ImmutableSet.of("family1", "family2"))
    .columnQualifiers(ImmutableSet.of("qualifier1", "qualifier2"))
    .cells()
    .startTimestampMicros(someTimestamp)
    .endTimestampMicros(someLatertimestamp)
    .latest()
    .executeAsync();
```

Get all rows between different ranges or with certain specific keys (these functions add rows to the row set, instead of filtering)

```java
final ListenableFuture<List<Row>> rows = bigtable.read("table")
    .rows()
    .addRowRangeOpen(myStartKeyOpen, myEndKeyOpen) // add an exclusive range
    .addRowRangeClosed(myStartKeyClosed, myEndKeyClosed) // add an inclusive range
    .addKeys(extraKeys) // add some keys you always want
    .executeAsync();
```

Note that there is currently no half-open, half-closed range.

## Other Operations

The client supports other Bigtable operations as well, and support for the remaining operations will hopefully come soon.

### Mutations (Writes, Deletions)

Mutations are performed at the row level, with many mutations possible within a single call. Mutations include writing new values as well as deleting a column, a column family, or an entire row and all data held in each.
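Several reads above filter on `startTimestampMicros` and `endTimestampMicros`, and a write can carry an explicit timestamp. The sketch below shows one way to derive such values with `java.time`. It assumes, as the parameter names suggest, that these values are microseconds since the Unix epoch. The class `TimestampMicros` and the one-hour window are hypothetical and not part of this library.

```java
import java.time.Instant;
import java.time.temporal.ChronoUnit;

// Hypothetical helper for illustration only; not part of simple-bigtable.
final class TimestampMicros {

  // Microseconds elapsed between the Unix epoch and the given instant
  // (assumed to be the unit the *Micros parameters expect).
  static long of(Instant instant) {
    return ChronoUnit.MICROS.between(Instant.EPOCH, instant);
  }

  public static void main(String[] args) {
    Instant now = Instant.now();
    long endMicros = of(now);
    long startMicros = of(now.minus(1, ChronoUnit.HOURS)); // one hour earlier
    System.out.println(startMicros + " .. " + endMicros);
  }
}
```

Values computed this way could then be passed wherever the examples use placeholders such as `someTimestamp` or `timestampMicros`.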
Write a new cell within a column

```java
final ListenableFuture<Empty> mutation = bigtable.mutateRow("table", "row")
    .write("family:qualifier", ByteString.copyFromUtf8("value"))
    .executeAsync();
```

Perform multiple writes in different columns, setting an explicit timestamp on some

```java
final ListenableFuture<Empty> mutation = bigtable.mutateRow("table", "row")
    .write("family:qualifier", ByteString.copyFromUtf8("value-1"), timestampMicros)
    .write("family", "qualifier", ByteString.copyFromUtf8("value-2"))
    .executeAsync();
```

Delete a column and then write to the same column

```java
final Empty mutation = bigtable.mutateRow("table", "row")
    .deleteColumn("family:qualifier")
    .write("family:qualifier", ByteString.copyFromUtf8("brand-new-value"))
    .execute();
```

### ReadModifyWrite (Atomically Update or Append To A Column)

ReadModifyWrite is useful for either incrementing the latest cell within a column by a long or appending bytes to the value. If the column is empty, it will write a