NoSQL Databases - CouchDB
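CouchDB is quite an interesting database. To sum up, its key characteristics are:
1. Its most distinctive feature is its file layout and commitment system, which lets it guarantee ACID properties. That is unusual among NoSQL stores; see 5.1.6.
2. It uses a view mechanism, which is very convenient: a view can be defined simply in JavaScript and generated with map/reduce logic. Note, however, that this is pseudo map/reduce: it only runs on a single machine and merely borrows the model.
There is one problem, though: views are updated at read time, so after a large number of data updates, generating the view becomes slow. Workarounds:
Query the view periodically from a cron job to trigger regular index updates, which reduces the time real read operations have to wait.
Version 1.1.0 added a stale=update_after option, which returns the old data first and then updates the index in the background.
3. A complete replication mechanism
CouchDB provides very convenient, easy-to-use replication. When the network is down you can still read and write on any node without being affected, and once the network recovers the replicas synchronize automatically. This is another CouchDB hallmark.
At the same time, I personally think this exposes one of its weaknesses: scalability, i.e. horizontal scaling.
CouchDB can only scale horizontally through replication and offers no sharding. In my view this does not really solve the horizontal scaling problem at all, because every read and write is still handled by a single node, and even map/reduce works on the documents of a single node.
So CouchDB is an interesting DB with equally pronounced strengths and weaknesses; its file layout and append-only model in particular are well worth borrowing.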
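The two view workarounds from point 2 (periodic warming from cron, and stale=update_after) can be sketched as plain URL construction. This is only a sketch under assumptions: the server address, the database name wiki, the design document articles and the view by_category are made-up placeholders, and nothing is sent over the network.

```python
from urllib.parse import urlencode

# Assumed address of a local CouchDB server (placeholder).
BASE = "http://localhost:5984"


def view_url(db, ddoc, view, **params):
    """Build the URL of a view query: /db/_design/ddoc/_view/view?params."""
    url = f"{BASE}/{db}/_design/{ddoc}/_view/{view}"
    if params:
        url += "?" + urlencode(params)
    return url


# A cron job could fetch this URL periodically; querying the view is what
# triggers the index update, and limit=1 keeps the response small.
warm_url = view_url("wiki", "articles", "by_category", limit=1)

# A reader that tolerates old rows asks for the existing index and lets
# the server refresh it afterwards (the option mentioned for 1.1.0).
stale_url = view_url("wiki", "articles", "by_category", stale="update_after")

print(warm_url)
print(stale_url)
```

Either URL could then be fetched with any HTTP client; which one fits depends on whether the caller can live with slightly outdated rows.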
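Although CouchDB and MongoDB are both document databases, apart from the abstract document data model they really have nothing in common... totally different.
Compared with MongoDB's conventional, mainstream design, CouchDB's design seems quite unusual and is hard for the broad community of traditional database developers to accept and understand.
In some specific scenarios CouchDB can still be a good choice:
The data volume is not very large and there is no strong need for sharding
Nodes are unstable and may be added or removed at any time, and the service should not be affected
Read operations are relatively fixed
Consistency requirements for reads are low, and some delay between a write and its visibility to reads is acceptable
Write efficiency and atomicity matter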
---------------------------------------------------------------------------------------------------------------------------------------------------------------
Document databases are considered by many as the next logical step from simple key-/value-stores to slightly more complex and meaningful data structures, as they at least allow key-/value-pairs to be encapsulated in documents. On the other hand, there is no strict schema that documents have to conform to, which eliminates the need for schema migration efforts (cf. [Ipp09]).
In this chapter Apache CouchDB and MongoDB as the two major representatives for the class of document databases will be investigated.
5.1. Apache CouchDB
5.1.1. Overview
CouchDB is a document database written in Erlang. The name CouchDB is nowadays sometimes referred to as “Cluster of unreliable commodity hardware” database.
CouchDB can be regarded as a descendant of Lotus Notes for which CouchDB’s main developer Damien Katz worked at IBM before he later initiated the CouchDB project on his own. A lot of concepts from Lotus Notes can be found in CouchDB: documents, views, distribution, and replication between servers and clients.
CouchDB can be briefly characterized as a document database which is accessible via a RESTful HTTP interface, containing schema-free documents in a flat address space.
The most notable use of CouchDB in production is Ubuntu One ([Can10a]; it seems Ubuntu One has since abandoned CouchDB), the cloud storage and replication service for Ubuntu Linux ([Can10b]). CouchDB is also part of the BBC’s new web application platform (cf. [Far09]). Furthermore some (less prominent) blogs, wikis, social networks, Facebook apps and smaller web sites use CouchDB as their datastore (cf. [C+10]).
http://wiki.apache.org/couchdb/
5.1.2. Data Model and Key Abstractions
Documents
The main abstraction and data structure in CouchDB is a document.
Documents consist of named fields that have a key/name and a value.
A field name has to be unique within a document and its assigned value may be a string (of arbitrary length), a number, a boolean, a date, an ordered list or an associative map (cf. [Apa10a]).
Documents may contain references to other documents (URIs, URLs) but these do not get checked or held consistent by the database (cf. [PLL09]).
A further limitation is that documents in CouchDB cannot be nested (cf. [Ipp09]).
A wiki article may be an example of such a document:
{
"Title": "CouchDB",
"Last editor": "172.5.123.91",
"Last modified": "9/23/2010",
"Categories": ["Database", "NoSQL", "Document Database"],
"Body": "CouchDB is a ...",
"Reviewed": false
}
CouchDB considers itself as a semi-structured database.
While relational databases are designed for structured and interdependent data and key-/value-stores operate on uninterpreted, isolated key-/value-pairs,
document databases like CouchDB pursue a third path: data is contained in documents which do not correspond to a fixed schema (schema-free) but have some inner structure known to applications as well as the database itself.
The advantages of this approach are that first there is no need for schema migrations, which cause a lot of effort in the relational database world; secondly, compared to key-/value-stores, data can be evaluated in a more sophisticated way (e. g. in the calculation of views).
In the web application field there are a lot of document-oriented applications which CouchDB addresses, as its data model fits this class of applications and documents can be extended or changed iteratively with a lot less effort compared to a relational database (cf. [Apa10a]).
CouchDB sits between relational databases and key-/value-stores: schema migrations are easy, yet it can describe more complex structures than a key-/value-store. It suits a large number of scenarios, mainly in the web application field.
The data model has no nesting and no hierarchy, only a single-level flat namespace that contains all the documents.
Each CouchDB database consists of exactly one flat/non-hierarchical namespace that contains all the documents which have a unique identifier (consisting of a document id and a revision number aka sequence id) calculated by CouchDB. The namespace is flat because nesting is not supported.
Document indexing is done in B-Trees which are indexing the document’s id and revision number (sequence id; cf. [Apa10b]).
Views
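CouchDB's way to query, present, aggregate and report the semi-structured document data is views (cf. [Apa10a], [Apa10b]).
The concept should be easy to grasp and is used in many places: however the data is stored, arbitrary views can be generated to suit the needs of different clients. You can think of it as a select statement in a relational database producing a view.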
A typical example for views is to separate different types of documents (such as blog posts, comments, authors in a blog system) which are not distinguished by the database itself as all of them are just documents to it ([PLL09]).
View definitions are strictly virtual and only display the documents from the current database instance, making them separate from the data they display and compatible with replication. CouchDB views are defined inside special **design documents** and can replicate across database instances like regular documents, so that not only data replicates in CouchDB, but entire application designs replicate too.
Views are defined by JavaScript functions which neither change nor save or cache the underlying documents but only present them to the requesting user or client application.
As all documents of the database are processed by a view’s functions this can be time consuming and resource intensive for large databases. Therefore a view is not created and indexed when write operations occur but on demand (at the first request directed to it) and updated incrementally when it is requested again.
View Indexes
Views are a dynamic representation of the actual document contents of a database, and CouchDB makes it easy to create useful views of data. But generating a view of a database with hundreds of thousands or millions of documents is time and resource consuming; it is not something the system should do from scratch each time.
To keep view querying fast, the view engine maintains indexes of its views, and incrementally updates them to reflect changes in the database. CouchDB’s core design is largely optimized around the need for efficient, incremental creation of views and their indexes.
Views and their functions are defined inside special “design” documents, and a design document may contain any number of uniquely named view functions. When a user opens a view and its index is automatically updated, all the views in the same design document are indexed as a single group.
Why are all Views in a single Index
For example:
view1: {"map": "function(doc) {if (doc.type === 'foo') {emit(key, value);}}", "reduce": "_count"}
view2: {"map": "function(doc) {if (doc.type === 'foo') {emit(key, value);}}", "reduce": "_sum"}
Here view1 and view2 have exactly the same map function. If they were in different design documents, there would be two B-trees (in two different index files) for exactly the same data.
Views are stored in design documents. Note that a design document and a view index are different things: the design document holds the definition of a view, while the view index holds the result of running that view over a particular database.
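To make the difference between a view definition and its index concrete, here is a rough sketch of what a design document holding both views might look like, written as a Python dictionary and dumped as JSON. The id _design/example is a made-up name, and emit(doc._id, 1) replaces the placeholder emit(key, value) from the example above; both are assumptions for illustration.

```python
import json

# One map function shared by both views; the emitted key and value are
# illustrative choices, not taken from the original example.
shared_map = "function(doc) { if (doc.type === 'foo') { emit(doc._id, 1); } }"

# Hypothetical design document: it stores only view *definitions*.
# The view index (the computed rows) lives in a separate index file.
design_doc = {
    "_id": "_design/example",
    "views": {
        "view1": {"map": shared_map, "reduce": "_count"},
        "view2": {"map": shared_map, "reduce": "_sum"},
    },
}

print(json.dumps(design_doc, indent=2))
```

Because both definitions sit in the same design document, they are indexed as one group, which is the point the example above is making.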
To update a view, the component responsible for it (called view-builder) compares the sequence id of the whole database and checks if it has changed since the last refresh of the view.
While the view-builder is updating a view, data from the view’s old state can still be read by clients. It is also possible to present the old state of the view to one client and the new one to another client, as view indexes are also written in an append-only manner and the compaction of view data does not discard an old index state while a client is still reading from it (more on that in subsection 5.1.7).
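The on-demand, incremental refresh described above can be illustrated with a toy model. This is a sketch under assumptions, not CouchDB's implementation: the class ViewIndex, its refresh method and the shape of the change list are invented here, and the real index is a B-tree on disk rather than a dictionary.

```python
class ViewIndex:
    """Toy view index that remembers the last sequence id it processed."""

    def __init__(self, map_fn):
        self.map_fn = map_fn
        self.last_seq = 0
        self.rows = {}  # doc id -> list of (key, value) rows emitted for it

    def refresh(self, changes, db_seq):
        """Bring the index up to db_seq.

        changes is a list of (seq, doc_id, doc) ordered by seq, where doc is
        None for a deleted document. Returns how many changes were processed.
        """
        if db_seq == self.last_seq:
            return 0  # nothing changed since the last refresh
        processed = 0
        for seq, doc_id, doc in changes:
            if seq <= self.last_seq:
                continue  # already reflected in the index
            self.rows.pop(doc_id, None)  # drop rows of the old revision
            if doc is not None:
                emitted = list(self.map_fn(doc))
                if emitted:
                    self.rows[doc_id] = emitted
            processed += 1
        self.last_seq = db_seq
        return processed


def posts_by_title(doc):
    # Only blog posts match this view's criteria.
    if doc.get("type") == "post":
        yield doc["title"], 1


changes = [(1, "a", {"type": "post", "title": "Hello"}),
           (2, "b", {"type": "comment"})]
index = ViewIndex(posts_by_title)
first = index.refresh(changes, db_seq=2)    # builds the index on first request
changes.append((3, "a", None))              # document "a" gets deleted
second = index.refresh(changes, db_seq=3)   # only the new change is processed
third = index.refresh(changes, db_seq=3)    # sequence id unchanged: no work
```

The second refresh touches one change instead of three, which is why a view that is queried regularly stays cheap to keep current.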
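CouchDB views are produced with map/reduce. When views have to be generated dynamically over large data sets this is the natural choice... but this map/reduce runs on a single node.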
The JavaScript functions defining a view are called map and reduce which have similar responsibilities as in Google’s MapReduce approach (cf. [DG04]).
The map function gets a document as a parameter, can do any calculation and may emit arbitrary data for it if it matches the view’s criteria; if the given document does not match these criteria the map function emits nothing.
The data structure emitted by the map function is a triple consisting of the document id, a key and a value which can be chosen by the map function.
After the map function has been executed, its results get passed to an optional reduce function which can do some aggregation on the view (cf. [PLL09]).
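The map and reduce roles described above can be mimicked in a few lines of Python. This is a single-process sketch of the model only: real CouchDB view functions are written in JavaScript, and the names run_view and posts_by_author as well as the sample documents are invented for illustration.

```python
def run_view(docs, map_fn, reduce_fn=None):
    """Apply map_fn to every document and optionally reduce the values."""
    rows = []
    for doc in docs:
        # map_fn yields zero or more (key, value) pairs for a document;
        # each row is stored as the triple (document id, key, value).
        for key, value in map_fn(doc):
            rows.append((doc["_id"], key, value))
    rows.sort(key=lambda row: (row[1], row[0]))  # order by key, then doc id
    if reduce_fn is None:
        return rows
    return reduce_fn([value for _, _, value in rows])


def posts_by_author(doc):
    # Emits nothing for documents that do not match the view's criteria.
    if doc.get("type") == "post":
        yield doc["author"], 1


docs = [
    {"_id": "p1", "type": "post", "author": "alice"},
    {"_id": "c1", "type": "comment", "author": "bob"},
    {"_id": "p2", "type": "post", "author": "alice"},
]

rows = run_view(docs, posts_by_author)                   # map only
total = run_view(docs, posts_by_author, reduce_fn=sum)   # map + reduce
```

Here rows holds one triple per post and total is the number of posts; the comment document produces no row at all.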
5.1.3. Versioning
Documents are updated optimistically and update operations do not imply any locks.
If an update is issued by some client, the contacted server creates a new document revision in a copy-on-modify manner (see section 3.3), and a history of recent revisions is stored in CouchDB until the database gets compacted the next time.
If a document is updated, not only the current revision number is stored but also a list of revision numbers preceding it, to allow the database (when replicating with another node or processing read requests) as well as client applications to reason on the revision history in the presence of conflicting versions (cf. [PLL09]).
Reading this author's academic paper takes me back to reading-comprehension exercises: the sentences are so long that you cannot follow them without reading carefully...
CouchDB's write strategy is called optimistic writing. It takes no locks at all; as in Dynamo, you can update freely without waiting for a lock.
This inevitably leads to conflicts, and resolving them requires knowing the temporal and causal order of the updates. Dynamo records this with vector clocks, whereas CouchDB records revisions.
CouchDB does not consider version conflicts as an exception but rather a normal case.
They can not only occur by different clients operating on the same CouchDB node but also due to clients operating on different replicas of the same database. It is not prohibited by the database to have an unlimited number of concurrent versions.
A CouchDB database can deterministically detect which versions of a document succeed each other and which are in conflict and have to be resolved by the client application.
Conflict resolution may occur on any replica node of a database, as the node receiving the resolved version transmits it to all replicas, which have to accept this version as valid. It may occur that conflict resolution is issued on different nodes concurrently; the locally resolved versions on both nodes then are detected to be in conflict and get resolved just like all other version conflicts (cf. [Apa10b]).
CouchDB treats conflicts as normal and allows multiple concurrent versions to exist at the same time, but it can detect which versions of the data are in conflict.
As in Dynamo, conflict resolution is done by the client, because the client knows the business logic and is best placed to do it.
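How a list of preceding revisions lets a node tell "one version succeeds the other" apart from "conflict" can be sketched as below. This is a deliberately simplified model: the function compare and the revision strings are invented for illustration, and CouchDB's actual revision handling is more involved than comparing two flat lists.

```python
def compare(history_a, history_b):
    """Compare two revision histories, each ordered oldest to newest.

    Returns "same", "a_newer", "b_newer" or "conflict".
    """
    if history_a[-1] == history_b[-1]:
        return "same"        # both replicas hold the same current revision
    if history_b[-1] in history_a:
        return "a_newer"     # b's current revision is an ancestor of a's
    if history_a[-1] in history_b:
        return "b_newer"     # a's current revision is an ancestor of b's
    return "conflict"        # neither descends from the other


# Replica B has simply seen one more update than replica A.
ahead = compare(["1-aaa"], ["1-aaa", "2-bbb"])

# Both replicas updated revision 1 independently: a genuine conflict
# that has to be handed to the client application.
diverged = compare(["1-aaa", "2-bbb"], ["1-aaa", "2-ccc"])
```

In the first case the node that is behind just takes the newer version; only the second case needs a decision based on business logic.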
5.1.4. Distribution and Replication
CouchDB is designed for distributed setups that follow a peer approach where each server has the same set of responsibilities and there are no distinguished roles (like in master/slave-setups, standby-clusters etc.).
This resembles a MongoDB replica set, except that all nodes are peers in a decentralized design.
Because it does not have to maintain consistency the way MongoDB does, the decentralized design is simpler: every node is independent and can handle read and write operations on its own, which gives strong partition tolerance and availability.
Different database nodes can by design operate completely independently and process read and write requests. Two database nodes can replicate databases (documents, document attachments, views) bilaterally if they reach each other via network.
The replication process works incrementally and can detect conflicting versions in a simple manner. From the current revision number as well as the list of outdated revision numbers CouchDB can determine whether two versions are conflicting or not;
if there are version conflicts both nodes have a notion of them and can escalate the conflicting versions to clients for conflict resolution;
if there are no version conflicts the node not having the most recent version of the document updates it (cf. [Apa10a], [Apa10b], [PLL09]).
The replication process operates incrementally and document-wise.
Incrementally means that only data changed since the last replication gets transmitted to another node and that not even whole documents are transferred but only changed fields and attachment-blobs;
document-wise means that each document successfully replicated does not have to be replicated again if a replication process crashes (cf. [Apa10b]).
Besides replicating whole databases CouchDB also allows for partial replicas. For these a JavaScript filter function can be defined which passes through the data for replication and rejects the rest of the database (cf. [Apa10b]). This partial replication mechanism can be used to shard data manually by defining different filters for each CouchDB node.
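Incremental, document-wise replication with an optional filter might be modelled as follows. This is a sketch under several assumptions: the function replicate, the checkpoint argument and the region field are invented, whole documents are copied instead of only changed fields, conflicts are ignored, and the filter is a Python callable standing in for the JavaScript filter function.

```python
def replicate(source_changes, target, checkpoint, doc_filter=None):
    """Copy changes newer than checkpoint from a source into target.

    source_changes is a list of (seq, doc_id, doc) ordered by seq and
    target is a dict mapping doc id to document. Returns the new checkpoint.
    """
    for seq, doc_id, doc in source_changes:
        if seq <= checkpoint:
            continue  # incremental: already handled by an earlier run
        if doc_filter is None or doc_filter(doc):
            target[doc_id] = doc
        # Document-wise: the checkpoint advances after every document, so a
        # crashed run can resume without redoing the documents before it.
        checkpoint = seq
    return checkpoint


def only_eu(doc):
    return doc["region"] == "eu"


changes = [(1, "a", {"region": "eu"}), (2, "b", {"region": "us"})]
eu_node = {}
checkpoint = replicate(changes, eu_node, 0, only_eu)

changes.append((3, "c", {"region": "eu"}))
checkpoint = replicate(changes, eu_node, checkpoint, only_eu)  # only seq 3
```

Giving each node a different filter is the manual sharding idea mentioned above: eu_node ends up holding only the documents its filter lets through.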
5.1.5. Interface
CouchDB databases are addressed via a RESTful HTTP interface that allows documents to be read and updated (cf. [Apa10b]).
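As a rough illustration of what addressing documents over HTTP looks like, the sketch below builds (but does not send) a PUT and a GET request with Python's standard library. The server address, the database name wiki and the document id couchdb are placeholders chosen for this example.

```python
import json
from urllib.request import Request

# Assumed address of a local CouchDB server (placeholder).
BASE = "http://localhost:5984"


def put_document(db, doc_id, doc):
    """Build a PUT request that stores doc under /db/doc_id."""
    body = json.dumps(doc).encode("utf-8")
    return Request(f"{BASE}/{db}/{doc_id}", data=body, method="PUT",
                   headers={"Content-Type": "application/json"})


def get_document(db, doc_id):
    """Build a GET request that reads the document at /db/doc_id."""
    return Request(f"{BASE}/{db}/{doc_id}", method="GET")


create = put_document("wiki", "couchdb", {"Title": "CouchDB", "Reviewed": False})
read = get_document("wiki", "couchdb")

# Passing either object to urllib.request.urlopen() would send it to a
# running server; that step is left out so the sketch runs offline.
print(create.get_method(), create.full_url)
print(read.get_method(), read.full_url)
```

Each document is a resource with its own URL, so the usual HTTP verbs map directly onto reading and writing it.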
5.1.6. ACID Properties
The CouchDB file layout and commitment system features all ACID (Atomic, Consistent, Isolated, Durable) properties.
File layout
On-disk, CouchDB never overwrites committed data or associated structures, ensuring the database file is always in a consistent state. This is a “crash-only" design where the CouchDB server does not go through a shut down process, it's simply terminated.
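This is CouchDB's most distinctive feature: every update (document creation, modification and deletion) is carried out by appending to the end of the couch file, which gives rise to the Multi-Version Concurrency Control (MVCC) model.
As a result, concurrent updates do not have to wait for locks, since existing data is never modified and new versions are simply appended. A crash is not a big deal either: at worst some data whose update had not finished is lost, but old data is unaffected.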
Document updates (add, edit, delete) are serialized, except for binary blobs which are written concurrently. Database readers are never locked out and never have to wait on writers or other readers. Any number of clients can be reading documents without being locked out or interrupted by concurrent updates, even on the same document. CouchDB read operations use a Multi-Version Concurrency Control (MVCC) model where each client sees a consistent snapshot of the database from the beginning to the end of the read operation.
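Update operations are serialized: they are all appends, and one append must finish before the next can start.
Reads are completely unaffected. Even if another client is concurrently modifying the same document, you can still read it; this too is guaranteed by the append-only design.
Better still, isolation is supported: each client sees a consistent snapshot of the database from the beginning to the end of the read operation. (With append-only storage this is fairly easy to implement: filter by a timestamp and ignore every update newer than it.)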
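The snapshot idea can be shown with a toy append-only store. This is a sketch, not CouchDB's mechanism: the classes AppendOnlyStore and Snapshot are invented, and instead of a timestamp the snapshot remembers the log length at the moment the read began, which plays the same filtering role.

```python
class Snapshot:
    """Read-only view of the log as it was when the snapshot was taken."""

    def __init__(self, log, end):
        self._log = log
        self._end = end  # entries at index >= end were appended later

    def read(self, doc_id):
        # Scan backwards so the newest entry inside the snapshot wins.
        for entry_id, value in reversed(self._log[:self._end]):
            if entry_id == doc_id:
                return value
        return None


class AppendOnlyStore:
    def __init__(self):
        self.log = []  # (doc_id, value) pairs; never modified in place

    def write(self, doc_id, value):
        self.log.append((doc_id, value))

    def snapshot(self):
        return Snapshot(self.log, len(self.log))


store = AppendOnlyStore()
store.write("a", "v1")
reader = store.snapshot()        # a read operation begins here
store.write("a", "v2")           # a concurrent update is appended

seen_by_reader = reader.read("a")            # still the old version
seen_by_new_reader = store.snapshot().read("a")
```

The reader never blocks and never sees a half-applied update, because the entries it depends on are never overwritten.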
Documents are indexed in B-trees by their name (DocID) and a Sequence ID. Each update to a database instance generates a new sequential number. Sequence IDs are used later for incrementally finding changes in a database. These b-tree indexes are updated simultaneously when documents are saved or deleted. The index updates always occur at the end of the file (append-only updates).
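To understand this part, see the references listed at the end and the structure diagram of a CouchDB database file.
The database file is divided into a header and a body; the body stores the documents and the indexes.
When documents are stored, two B-tree indexes are built (on DocID and on Sequence ID). Although the vast majority of the B-tree data lives in the body, the root nodes of the B-trees are stored in the header.
Document updates are appended, and the index has to be updated together with the document; index updates are appended as well.
An important point here: if you merely keep appending to the body, the new data is not visible to users. Why?
Because the index root is stored in the header, and every read traverses the B-tree starting from the root node. As long as the root node in the header is not updated, readers still reach the old version of the data. In CouchDB all updates are appends except for the update of the root node in the header, which is an overwrite; to make sure the root node is updated correctly, two identical copies of the header are kept.
So, as the diagram shows, the new content (green) is appended to the body, but as long as the root is not updated users can still only see the old content (yellow). Only once the header update has completed can users see the new content.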
Commitment system
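When a CouchDB document is updated, the commit is split into the following two steps to keep the data consistent:
1. The document data and the associated index updates are appended to the file and flushed to disk.
2. The updated header is written and flushed to disk.
If a failure (system crash or power loss) occurs during step 1, the header of the couch file has not changed, so all appended data is simply ignored. If a failure occurs during step 2, the header may be corrupted; both the first and the second header copy are then verified, and if either one is usable the database file is usable.
This is how CouchDB guarantees atomic updates.
An ordinary database that needs atomicity must have a rollback mechanism: because it overwrites data in place, a crash halfway through a change means the changes already made have to be undone, which is fairly complicated.
CouchDB has it much easier. With appending, new data does not take effect as long as the root node is unchanged, so all-or-nothing semantics are easy to achieve.
To guard against a crash corrupting the header while it is being updated, two copies of the header are stored; if one is garbled, the other can be used for recovery. Very convenient indeed.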
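The two-step commit and the role of the duplicated header can be played through with a small in-memory model. Everything here is an assumption made for illustration: the class CouchFile, the crash_at switch and the checksum scheme are invented, and the header stores a whole index as JSON where the real file stores B-tree root pointers.

```python
import json
import zlib


def make_header(index):
    """Serialize an index and pair it with a checksum for validation."""
    payload = json.dumps(index, sort_keys=True).encode()
    return payload, zlib.crc32(payload)


class CouchFile:
    def __init__(self):
        self.body = []  # append-only document revisions
        self.headers = [make_header({}), make_header({})]  # two copies

    def _index(self):
        # Use the first header copy whose checksum still matches.
        for payload, checksum in self.headers:
            if zlib.crc32(payload) == checksum:
                return json.loads(payload)
        raise IOError("both header copies are corrupt")

    def commit(self, docs, crash_at=None):
        index = self._index()
        # Step 1: append the new revisions; the header is untouched.
        for doc_id, value in docs.items():
            self.body.append(value)
            index[doc_id] = len(self.body) - 1
        if crash_at == "step1":
            return
        # Step 2: overwrite both header copies, one after the other.
        payload, checksum = make_header(index)
        if crash_at == "step2":
            # Simulate a torn write: the first copy fails its checksum.
            self.headers[0] = (payload, checksum ^ 1)
            return
        self.headers[0] = (payload, checksum)
        self.headers[1] = (payload, checksum)

    def read_all(self):
        return {doc_id: self.body[pos] for doc_id, pos in self._index().items()}


f = CouchFile()
f.commit({"a": "v1"})
f.commit({"a": "v2"}, crash_at="step1")   # appended but never committed
after_step1_crash = f.read_all()
f.commit({"a": "v3"}, crash_at="step2")   # first header copy is damaged
after_step2_crash = f.read_all()
f.commit({"a": "v4"})                     # a later commit succeeds normally
after_recovery = f.read_all()
```

After either crash the reader still gets v1, and no rollback code is needed: the orphaned revisions v2 and v3 simply stay unreachable in the body.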
5.1.7. Compaction
Wasted space is recovered by occasional compaction. On schedule, or when the database file exceeds a certain amount of wasted space, the compaction process clones all the active data to a new file and then discards the old file. The database remains completely online the entire time and all updates and reads are allowed to complete successfully. The old file is deleted only when all the data has been copied and all users transitioned to the new file.
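Because updates are appended, the database needs to be slimmed down after it has been running for a while by clearing out old document data. This process is called compaction. The database remains available during compaction, but note that compaction works by traversing the DBName.couch file and copying the latest data into a DBName.compat file, so it can take up a lot of storage space. If you compact while the system is busy (mainly with writes), you may run out of disk space, so be careful!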
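Why compaction temporarily needs extra space can be seen in a toy version of the copy step. This is a sketch under assumptions: the function compact and the (doc id, revision number, value) log format are invented, and real compaction works on B-trees inside the file rather than on a Python list.

```python
def compact(log):
    """Return a new log that keeps only the latest entry of each document.

    log is a list of (doc_id, rev_no, value) tuples in append order.
    """
    latest = {}
    for doc_id, rev_no, value in log:
        latest[doc_id] = (rev_no, value)  # later entries replace earlier ones
    return [(doc_id, rev_no, value)
            for doc_id, (rev_no, value) in latest.items()]


old_log = [("a", 1, "x"), ("b", 1, "y"), ("a", 2, "z")]
new_log = compact(old_log)

# Until the switch to the new file is complete, both files exist side by
# side, so the peak footprint is the old file plus the new one.
peak_entries = len(old_log) + len(new_log)
```

With heavy write traffic the old file keeps growing while the copy is running, which is where the risk of filling the disk comes from.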
Ten CouchDB problems that cause headaches
http://blog.nosqlfan.com/html/3667.html
Understanding CouchDB (1): features and implementation
http://www.iteye.com/topic/319839