This repository was archived by the owner on Mar 11, 2026. It is now read-only.
Key bugs using a 32-bit signed int calculated using a fast hash #6
Open
davcamer wants to merge 3 commits into ACRA:master
TL;DR: This reduces the number of views, which cuts both size on disk and view server processing time, and in some cases reduces the number of trips to the view server.

This is a bit of a novel because it is a squash of three well-researched, but minor, commits.

Remove reduce functions which only returned null

In the description of Lookup Views on the wiki, they specifically mention that null is an appropriate emit value for a view that will only be used for lookup. I believe that having a reduce function defined for such a view will require round trips to the view server, but won't accomplish any actual work. On the other hand, when no reduce function is defined, the view automatically ignores reduce, at least on the query side (but hopefully for generation as well), as described in the querying options section.

http://wiki.apache.org/couchdb/Introduction_to_CouchDB_views#Lookup_Views
http://wiki.apache.org/couchdb/HTTP_view_API#Querying_Options

Use built-in _sum function where appropriate

This should eliminate trips to the view server by doing summation in the Erlang process. Documentation from couchapp (erica's predecessor) describes the simple file with '_sum' as the correct way to do this for a couchapp. Several sources reference the increased efficiency of using the built-in function as compared to the same JavaScript function.

http://couchapp.org/page/faq
http://wiki.apache.org/couchdb/Built-In_Reduce_Functions
http://docs.couchdb.org/en/latest/ddocs.html
http://nosql.mypopescu.com/post/773435732/couchdb-built-in-reduce-functions

Replacing reports-per views with recent-items where possible

- reports-per-android-version => recent-items-by-androidver
- reports-per-app-version-code => recent-items-by-appver
- reports-per-app-version-name => recent-items-by-appvercode
- recent-items-by-bug => recent-items-by-bug-by-installation-id

For these views, and also for recent-items and recent-items-by-installation-id, the summary data is no longer captured in the view value. Instead, Acralyzer will use the include_docs option to get the needed data in the browser.

- makes document storage about one-third smaller
- makes view indexing about one-quarter faster
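
The three changes described above can be sketched as a design document fragment plus a query helper. This is only an illustration under stated assumptions, not the actual Acralyzer design document: the view names `lookup-by-installation-id` and `reports-per-day`, the document fields `installation_id` and `timestamp`, the design document id, and the helper `recentItemsQuery` are all hypothetical.

```javascript
// Hypothetical design document fragment (names and fields are illustrative).
// CouchDB stores map functions as strings, hence the toString() calls.
const designDoc = {
  _id: "_design/example",
  views: {
    // Lookup view: emits null as the value and defines no reduce function.
    "lookup-by-installation-id": {
      map: function (doc) {
        if (doc.installation_id) {
          emit(doc.installation_id, null);
        }
      }.toString(),
    },
    // Counting view: the built-in _sum reduce is named by string, so the
    // summation does not need a JavaScript reduce function.
    "reports-per-day": {
      map: function (doc) {
        if (doc.timestamp) {
          emit(doc.timestamp.slice(0, 10), 1);
        }
      }.toString(),
      reduce: "_sum",
    },
  },
};

// Build a view query path that asks CouchDB to attach the full document to
// each row (include_docs), instead of storing summary data in the view value.
function recentItemsQuery(db, ddoc, view, limit) {
  const params = new URLSearchParams({
    include_docs: "true",
    descending: "true",
    limit: String(limit),
  });
  return "/" + db + "/_design/" + ddoc + "/_view/" + view + "?" + params.toString();
}
```

With `include_docs=true`, each returned row carries a `doc` property holding the full document, so the browser can read whatever summary fields it needs from there.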
Author

Here's the larger set of changes we've made on our acra-storage deployment. Through some poor decisions about reporting, we ended up with a VERY large set of crash reports: millions. These changes greatly improved the size and speed of our Couch instance. Indexing was an order of magnitude faster. I had been waiting to see if the initial change that keys bugs by hash code was accepted, but thinking about it more, it seems better to get the full range of changes in the open. With the full picture, any discussion of individual changes might be easier.
Member

I'm sorry for coming late here, time is hard to find these days. Thanks a lot for your contribution, I'm starting to review your proposed changes.
don796864 approved these changes on May 8, 2024
The hash algorithm is based on Java's String.hashCode
Code was taken from a response on StackOverflow:
http://stackoverflow.com/questions/7616461/generate-a-hash-from-string-in-javascript-jquery
This cut indexing time by 50% and index size by 20%.
Nothing is free though, and insert time is increased by 25%.
Inserting 1000 documents is still 6x faster than indexing them.
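
For reference, a JavaScript port of Java's `String.hashCode`, along the lines of the StackOverflow answer linked above, might look like the sketch below. It is not the code from this commit, and the function name `hashCode` is an assumption made for illustration.

```javascript
// Sketch of Java's String.hashCode in JavaScript: for each character,
// h = 31 * h + charCode, truncated to a 32-bit signed integer at every step.
function hashCode(str) {
  let hash = 0;
  for (let i = 0; i < str.length; i++) {
    // (hash << 5) - hash equals hash * 31; "| 0" wraps to a 32-bit signed int
    hash = ((hash << 5) - hash + str.charCodeAt(i)) | 0;
  }
  return hash;
}
```

Because the result is a 32-bit signed int, it can be negative, and distinct strings can collide on the same key.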