Thursday, March 22, 2012

At what point is schema information loaded?

I am hoping that someone with a good knowledge of the internals of
SQLServer will be able to help with the following - admittedly bizarre
- question (assume SQLServer 2000 throughout):
We have two SQLServer databases, dbmain and dbslave. An application
calls stored procedures in dbmain, some of which in turn access tables
in dbslave, ie we have something like this:
create procedure some_sp_in_dbmain_called_by_app
as
select * from dbslave.dbo.someTable
Significantly, dbslave is only ever accessed in this way, ie via
dbmain.
The question is:
At what point does the SQLServer engine load information about dbslave
from disk? Presumably in order to fulfil the select query in the
example it needs to have schema information for dbslave at that point,
but is this information loaded on demand at the point of the call, or
is information on all databases loaded when the server is started?
I should explain that this question is motivated by a legal case I am
working on, where it is important that we establish the point at which
any part of dbslave is copied into RAM from disk. For this reason, any
links to pertinent documentation or papers would be particularly
welcome.
TIA for any help.
Some but generally not all the meta-data is loaded when the database is
attached when the sql server service starts - generally when the machine
boots. Given read-ahead, recovery, and shared extents it's pretty much
impossible to predict when a given piece of meta-data is loaded into memory.
If there are a lot of transactions that need to be recovered, a significant
portion of the schema information might be loaded at startup.
This posting is provided "AS IS" with no warranties, and confers no rights.
Use of included script samples are subject to the terms specified at
http://www.microsoft.com/info/cpyright.htm
<dduv@.hotmail.com> wrote in message
news:1177016287.857608.294980@.q75g2000hsh.googlegr oups.com...
> I am hoping that someone with a good knowledge of the internals of
> SQLServer will be able to help with the following - admittedly bizarre
> - question (assume SQLServer 2000 throughout):
> We have two SQLServer databases, dbmain and dbslave. An application
> calls stored procedures in dbmain, some of which in turn access tables
> in dbslave, ie we have something like this:
> create procedure some_sp_in_dbmain_called_by_app
> as
> select * from dbslave.dbo.someTable
> Significantly, dbslave is only ever accessed in this way, ie via
> dbmain.
> The question is:
> At what point does the SQLServer engine load information about dbslave
> from disk? Presumably in order to fulfil the select query in the
> example it needs to have schema information for dbslave at that point,
> but is this information loaded on demand at the point of the call, or
> is information on all databases loaded when the server is started?
> I should explain that this question is motivated by a legal case I am
> working on, where it is important that we establish the point at which
> any part of dbslave is copied into RAM from disk. For this reason, any
> links to pertinent documentation or papers would be particularly
> welcome.
> TIA for any help.
>
|||On 19 Apr, 22:33, "Roger Wolter[MSFT]" <rwol...@.online.microsoft.com>
wrote:
> Some but generally not all the meta-data is loaded when the database is
> attached when the sql server service starts - generally when the machine
> boots. Given read-ahead, recovery, and shared extents it's pretty much
> impossible to predict when a given piece of meta-data is loaded into memory.
> If there are a lot of transactions that need to be recovered, a significant
> portion of the schema information might be loaded at startup.
>
Roger,
Thanks very much for your reply. I understand that there isn't
necessarily a simple answer to this, but if I may, let me see if I can
paraphrase, assuming a simple startup scenario with no recovery:
= When you start the sql server service, it attaches your databases
(including master and (from my example) dbmain and dbslave;
= For each database attached (still at startup time here) sql server
reads *some* of the data from *some* of the sys- tables and uses this
to build an internal in-memory representation of the database's schema
(btw I'm guessing that master is in some senses a special case, but I
am less interested in this).
If I have this correct, is it possible to qualify the *some*s any
further? For example, will it construct a representation of all tables
using data from sysobjects et al, but delay examining stored procedure
code from syscomments until they are actually accessed, something of
that sort? Or is the algorithm more subtle than this?
Many thanks again.
|||On 19 Apr, 22:33, "Roger Wolter[MSFT]" <rwol...@.online.microsoft.com>
wrote:
> Some but generally not all the meta-data is loaded when the database is
> attached when the sql server service starts - generally when the machine
> boots. Given read-ahead, recovery, and shared extents it's pretty much
> impossible to predict when a given piece of meta-data is loaded into memory.
> If there are a lot of transactions that need to be recovered, a significant
> portion of the schema information might be loaded at startup.
>
Roger,
Thanks very much for your reply. I understand that there isn't
necessarily a simple answer to this, but if I may, let me see if I can
paraphrase, assuming a simple startup scenario with no recovery:
= When you start the sql server service, it attaches your databases
(including master and (from my example) dbmain and dbslave;
= For each database attached (still at startup time here) sql server
reads *some* of the data from *some* of the sys- tables and uses this
to build an internal in-memory representation of the database's schema
(btw I'm guessing that master is in some senses a special case, but I
am less interested in this).
If I have this correct, is it possible to qualify the *some*s any
further? For example, will it construct a representation of all tables
using data from sysobjects et al, but delay examining stored procedure
code from syscomments until they are actually accessed, something of
that sort? Or is the algorithm more subtle than this?
Many thanks again.
|||On 19 Apr, 22:33, "Roger Wolter[MSFT]" <rwol...@.online.microsoft.com>
wrote:
> Some but generally not all the meta-data is loaded when the database is
> attached when the sql server service starts - generally when the machine
> boots. Given read-ahead, recovery, and shared extents it's pretty much
> impossible to predict when a given piece of meta-data is loaded into memory.
> If there are a lot of transactions that need to be recovered, a significant
> portion of the schema information might be loaded at startup.
>
Roger,
Thanks very much for your reply. I understand that there isn't
necessarily a simple answer to this, but if I may, let me see if I can
paraphrase, assuming a simple startup scenario with no recovery:
= When you start the sql server service, it attaches your databases
(including master and (from my example) dbmain and dbslave;
= For each database attached (still at startup time here) sql server
reads *some* of the data from *some* of the sys- tables and uses this
to build an internal in-memory representation of the database's schema
(btw I'm guessing that master is in some senses a special case, but I
am less interested in this).
If I have this correct, is it possible to qualify the *some*s any
further? For example, will it construct a representation of all tables
using data from sysobjects et al, but delay examining stored procedure
code from syscomments until they are actually accessed, something of
that sort? Or is the algorithm more subtle than this?
Many thanks again.
|||Profuse apologies for the triplicate post - Google groups appears to
be a bit skittish today ...
|||First, you can't ignore recovery - a database goes through recovery every
time you start it. Second, there's no way to determine when a piece of
schema get loaded into memory. If a page from sysobjects is read to access
a table, it might have information on dozens of other tables on the page.
SQL also does read-ahead so accessing one page might cause a dozen to be
read into memory. I can't see what possible use this would be to anybody
anyway. Whether data is one disk or in memory is a pretty arbitrary
distinction that may be different every time a database starts. You could
detach the database when you shut down and only attach it when you need it
but that seems rather silly.
This posting is provided "AS IS" with no warranties, and confers no rights.
Use of included script samples are subject to the terms specified at
http://www.microsoft.com/info/cpyright.htm
<dduv@.hotmail.com> wrote in message
news:1177076014.570168.111420@.p77g2000hsh.googlegr oups.com...
> On 19 Apr, 22:33, "Roger Wolter[MSFT]" <rwol...@.online.microsoft.com>
> wrote:
> Roger,
> Thanks very much for your reply. I understand that there isn't
> necessarily a simple answer to this, but if I may, let me see if I can
> paraphrase, assuming a simple startup scenario with no recovery:
> = When you start the sql server service, it attaches your databases
> (including master and (from my example) dbmain and dbslave;
> = For each database attached (still at startup time here) sql server
> reads *some* of the data from *some* of the sys- tables and uses this
> to build an internal in-memory representation of the database's schema
> (btw I'm guessing that master is in some senses a special case, but I
> am less interested in this).
> If I have this correct, is it possible to qualify the *some*s any
> further? For example, will it construct a representation of all tables
> using data from sysobjects et al, but delay examining stored procedure
> code from syscomments until they are actually accessed, something of
> that sort? Or is the algorithm more subtle than this?
> Many thanks again.
>
|||On 20 Apr, 15:50, "Roger Wolter[MSFT]" <rwol...@.online.microsoft.com>
wrote:
> First, you can't ignore recovery - a database goes through recovery every
> time you start it.
OK, I hadn't appreciated that.

> Second, there's no way to determine when a piece of
> schema get loaded into memory. If a page from sysobjects is read to access
> a table, it might have information on dozens of other tables on the page.
> SQL also does read-ahead so accessing one page might cause a dozen to be
> read into memory.
Of course. I should have realised this, and this is actually the
clincher.

> I can't see what possible use this would be to anybody
> anyway. Whether data is one disk or in memory is a pretty arbitrary
> distinction that may be different every time a database starts. You could
> detach the database when you shut down and only attach it when you need it
> but that seems rather silly.
I agree that the whole line of inquiry seems bizarre, but we are
labouring under a very precise definition of "copying" that does not
really apply very naturally to software, especially "meta-software"
such as a database schema, and the intention was to try and scope just
how much of the schema is "copied" from disk to memory. But from your
explanations above I can now see precisely how my original question
does not have a sensibly deterministic answer, so that's actually a
big help in and of itself. Many thanks for your help.

No comments:

Post a Comment