Documents are saved to a document library without required metadata (by using windows explorer or multiple document upload) and then sit on the server, sad, dejected, and invisible to everyone but the person who created them. Add to that, they are even invisible to admins! The only way to determine if they are there is to navigate to the URL of the “manage documents with no checked in version” tool in the doc library. From that page, you can take ownership of them and then check them in.
This wouldn’t be as bad a problem if the URL weren’t based on the GUID of the library instead of just a sensible name.
Given that we are using Project Server, and the document libraries I’m interested in looking at are all named http:servername/PWA/Project Site/Project Documents, I was able to find a way to report on this by using two parts:
Part One: Finding the GUID
Back in the olden days of Project 2003, a few of my colleagues wrote a macro that would actually navigate to each Project Site and pull data out of the tables into Excel. This macro was HO ugly, but it did the trick…at least mostly. When I realized that the GUID was buried out there in the HTML, I thought to myself that if they could scrape data out of tables on a web page, I should be able to scrape data out of the code, right?
I dug around a lot, and finally found a lot of queries that went out into the HTML, pulled data based on certain tags, and returned it into your spreadsheet. This is the basic principle:
Ok, first you have to get the URLs from Project Server which you are going to use to search for your GUID. Here’s what I did so I wouldn’t have to query the content database:
First I created a pretty darn simple query which pulls the URLs for the Project Sites and adds some junk on the end:
ProjectWorkspaceInternalHRef + N’/Project Documents/’ as [PD],
ProjectWorkspaceInternalHRef + N’/_layouts/ManageCheckedOutFiles.aspx?List=%7B’ as [Check],
ProjectWorkspaceInternalHRef IS NOT NULL
Now, the pivot table could be put together in a lot of different ways, but if you’re only half-following me here, just do this and don’t ask questions – put the “check” field in your pivot table, then put in the PD field. They should be tabulated so that they both show in separate columns, side by side. The “PD” field must be on the right!
Now, you will create a function in VBA that you’re going to call later with a macro. Mine looks exactly like this:
Function GetGUID(WebPage As String, Tag As String)
Dim t, Tag2 As String, EndTag As String
Dim oHttp As Object, txt$, i&, j
‘Adjust this to suit
‘Chr(34) is a “
Tag2 = Tag & “=”
EndTag = Chr(34)
On Error Resume Next
Set oHttp = CreateObject(“MSXML2.XMLHTTP”)
If Err <> 0 Then Set oHttp = CreateObject(“MSXML.XMLHTTPRequest”)
If oHttp Is Nothing Then MsgBox “MSXML2.XMLHTTP not found”, 16, “Error”: Exit Function
On Error GoTo 0
.Open “GET”, WebPage, False
txt = .responseText
i = InStr(1, txt, Tag2, 1)
If i = 0 Then
GetGUID = Tag & ” not found”
t = Split(txt, Tag2)(1)
GetGUID = Split(t, EndTag)(0) + “%7D”
Set oHttp = Nothing
Now now, there’s no need to cry. I only first started playing around with VBA about a month ago, and I hacked this code my very own self! You don’t think I’d leave you alone with it, do you?
The important things to note are that in the first line, it actually references the macro you’re going to use later to make this sucker run, so ignore that for now. Let’s look at the crazy “Tag2 =” crud first.
Tag2 = Tag & “=”
This is going to be part of your return string and it’s going to help the macro know what to search for in the HTML code. It’s the tag you’ll search for (like Title, or Body etc) followed by the = sign. The “EndTag” portion is just a ” in crazytalk.
So it’s saying search for HTMLTAG= blabitty blah blah” (most have a starting ” but the place we’re getting the GUID from doesn’t)
Ok, now ignore the rest of this code until you get to the ELSE:
GetGUID = Split(t, EndTag)(0) + “%7D”
So that’s saying add the thing from above (the value you want taken out of all the junk you don’t want) all the way to the “, and then add %7D.
What the heck is a %7D, you might ask?
Well, those URLs for managing the checked out docs end in %7D for some reason. This means we have less crud to do in Excel when we construct those URLs.
Ok, now you have your Function, let’s write the Macro. Here’s what mine actually looks like:
Do While IsEmpty(ActiveCell.Offset(0, -1)) = False
ActiveCell.Value = GetGUID(ActiveCell.Offset(0, -1).Value, “aspx?List”)
So, the aspx?List is what shows up right before the GUID, see?
For me D3 is the first cell to the right of my list of URLs. You can change that as you need. The next line will tell it the name of the “tag” to search, and where to put that data when it gets it from the function. Then it goes to the next line and starts over.
I’m going to warn you now that this is a bit slow. It chugs along for a bit, so I’d recommend trying it out on only a few ULRs first.
Once you’ve pulled in the GUIDS, put a little formula in the next cell over to create the URL of your “manage documents…” location – all you need is:
=A3&C3 (where A3 is the “check” field and C3 is the GUID you just pulled)
Ok, that’s your url. Technically, you could just click on those and check the libraries individually, but that seems like a lot of work when you’re almost there!
Tomorrow I’ll show you how to alter that original function to pull the info from the “manage documents…” screen so that you can know if there are any checked out documents before you go clicking on stuff!