I was looking at potential concurrency issues in DB so i went to read up. I found http://publib.boulder.ibm.com/infocenter/db2luw/v8/index.jsp?topic=/com.ibm.db2.udb.doc/admin/c0005267.htm and it mentions access to uncommitted data.
Access to uncommitted data. Application A might update a value in the database, and application B might read that value before it was committed. Then, if the value of A is not later committed, but backed out, the calculations performed by B are based on uncommitted (and presumably invalid) data.
What... i thought other sessions (same app and even same thread) can read data that has not been committed yet? I thought only the connection/session (i am not sure of my terminology) that wrote the data into the uncommitted transaction can read uncommitted data.
Can other threads really read data that hasnt been committed? I plan to use mysql but i may use sqlite
What other sessions can read depends on how you set up your database. In MySQL it also depends on what database engine you use. The term you're looking for (in ANSI SQL terms) is "isolation level".
Many databases will default to an isolation level where reads on uncommitted data will block. So if transaction A updates record 1234 in table T and then transaction B tries to select record 1234 before A commits or rolls back then B will block until A does one of those things.
One serious downside of this is that batch update operations that live in long-running transactions (typically) can potentially block many requests.
You can also set it so B will see uncommitted data but that is often ill-advised.
Alternatively you can use a scheme called MVCC ("Multiversion concurrency control"), which will give different transactions a consistent view of the data based on the time the transaction started. This avoids the uncommitted read problem (reading data that may be rolled back) and is much more scalable, especially in the context of long-lived transactions.
Certainly in SQL Server you can, you have to chose to do it, it is not the default, but if you use the right isolation level or query hint you can chose to read an uncommitted row, this can leads to problems and even a double read of the same row in theory.
That article mentions access to uncommitted data as one of the problems eliminated by the database manager.
The database manager controls this access to prevent undesirable effects, such as:
- Access to uncommitted data.
MySQL's InnoDB storage engine supports several transaction isolation levels. For details, see http://dev.mysql.com/doc/refman/5.4/en/set-transaction.html.
For some versions of some databases, setting queries to be able to read uncommitted will improve performance, because of reduced locking. That still leaves questions of security, reliability, and scalability to be answered.
To give a specific, I used to work on a very large e-commerce site. They used read uncommitted on reads to the store catalog, since the data was heavily accessed, infrequently changed, and not sensitive to concerns about reading uncommitted data. Any data from the catalog that was used to place an order would be re-verified anyway. This was on SQL Server 2000, which was known to have locking performance problems. On newer versions of SQL Server, the locking performance has improved, so this wouldn't be necessary.