The PDFLegal Main Form

We haven’t done anything with the Form1 that was automatically created, so let’s rename it to MainForm. The first thing I do with a new form is change the font from the default 8.25 pt Microsoft Sans Serif to 9 pt Segoe UI. That way, every control placed on the form inherits that font.

Let’s drop a status bar and a toolbar on the form. You’ll notice they position themselves automatically. Now drop a panel into the form and set it to dock in the parent container. This will make the panel expand to take up all the unused form space between the toolbar at the top and the status bar at the bottom.

Now open the Data Sources tab, expand DataSet1, and drag Files onto the form. This will create a data grid view and a binding source navigator, as well as data set and binding source components. Delete the binding source navigator. If you now dock the data grid view in its parent (which is the panel), it will expand to fill the screen.

Set the data grid view up the way you want. I like to remove the ability to add and remove rows, since the application won’t need them, and I like to set the selector to full row select. But I don’t think it matters for this application. Here’s what my form looks like, except I temporarily made it very short just for the snapshot.
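If you prefer to apply those grid settings in code rather than in the designer, a minimal sketch might look like the following. The class and method names (GridSetup, MakeReadOnlyList) are my own inventions for illustration; the properties themselves are standard DataGridView members.

```csharp
using System.Windows.Forms;

public static class GridSetup
{
    // Applies the grid settings described above to any DataGridView.
    // Hypothetical helper; call it from the form's constructor after
    // InitializeComponent(), passing the grid the designer created.
    public static void MakeReadOnlyList(DataGridView grid)
    {
        grid.AllowUserToAddRows = false;     // hide the blank "new row" line
        grid.AllowUserToDeleteRows = false;  // the Delete key no longer removes rows
        grid.SelectionMode = DataGridViewSelectionMode.FullRowSelect;
        grid.Dock = DockStyle.Fill;          // fill the parent panel
    }
}
```

Setting these in the designer has the same effect; the code version is just easier to copy between projects.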

For the code to work, we’ll need the following using statements.

using System;
using System.Data;
using System.IO;
using System.Windows.Forms;
using iText.Kernel.Pdf;
using Microsoft.WindowsAPICodePack.Dialogs;

Here’s the code behind the button, which I called toolLoadFolder. We start by creating and showing the folder picker, feeding it whatever text is in the text box, which I called toolFolderName. If the text is a valid path, the folder picker opens at that path. Conveniently, if the text is not a valid path, the folder picker just ignores it.

private void toolLoadFolder_Click(object sender, EventArgs e)
{
    // The using block disposes the dialog even if an exception is thrown.
    using (var dialog = new CommonOpenFileDialog())
    {
        dialog.IsFolderPicker = true;
        dialog.InitialDirectory = toolFolderName.Text;
        if (dialog.ShowDialog() != CommonFileDialogResult.Ok)
            return;

        toolFolderName.Text = dialog.FileName;

        // Empty the table, then force the RecID counter back to 1.
        // The counter only moves in the direction of the step, so we
        // step backward first and then restore the normal seed and step.
        dataSet1.Files.Clear();
        dataSet1.Files.RecIDColumn.AutoIncrementStep = -1;
        dataSet1.Files.RecIDColumn.AutoIncrementSeed = -1;
        dataSet1.Files.RecIDColumn.AutoIncrementStep = 1;
        dataSet1.Files.RecIDColumn.AutoIncrementSeed = 1;

        // Add one row per PDF in the selected folder (subfolders are not searched).
        DirectoryInfo ds = new DirectoryInfo(toolFolderName.Text);
        foreach (FileInfo fi in ds.EnumerateFiles("*.pdf"))
        {
            DataRow dr = dataSet1.Files.NewRow();
            dr.SetField(dataSet1.Files.FileNameColumn, fi.Name);
            dr.SetField(dataSet1.Files.FolderNameColumn, fi.DirectoryName);
            dr.SetField(dataSet1.Files.FullNameColumn, fi.FullName);
            dr.SetField(dataSet1.Files.PageCountColumn, 0); // calculated later
            dataSet1.Files.AddFilesRow(dr as DataSet1.FilesRow);
        }
    }
}

After selecting a folder, we clear the data table and reset the autoincrement. Then we loop through the PDFs in the folder, storing their information in the data table. As you can see, the code is not complicated at all.
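The four autoincrement assignments look odd, so here is a small stand-alone sketch of the same trick on a plain DataTable, outside the form. The class, method, and column names (AutoIncrementDemo, BuildTable, ClearAndResetIds, AddFile, RecID, FileName) are made up for this illustration. As far as I can tell, the counter only moves in the direction of the step, which is why the step is flipped to negative before the seed is put back.

```csharp
using System;
using System.Data;

public static class AutoIncrementDemo
{
    // Builds a two-column table whose RecID column counts 1, 2, 3, ...
    public static DataTable BuildTable()
    {
        var table = new DataTable("Files");
        DataColumn id = table.Columns.Add("RecID", typeof(int));
        id.AutoIncrement = true;
        id.AutoIncrementSeed = 1;
        id.AutoIncrementStep = 1;
        table.Columns.Add("FileName", typeof(string));
        return table;
    }

    // Clears the rows and restarts the counter, using the same four
    // assignments as the button handler above.
    public static void ClearAndResetIds(DataTable table)
    {
        DataColumn id = table.Columns["RecID"];
        table.Clear();
        id.AutoIncrementStep = -1;  // let the counter move downward
        id.AutoIncrementSeed = -1;  // pull it below the old value
        id.AutoIncrementStep = 1;   // count upward again
        id.AutoIncrementSeed = 1;   // and start from 1
    }

    // Adds a row and returns the RecID it was given.
    public static int AddFile(DataTable table, string fileName)
    {
        DataRow row = table.NewRow();
        row["FileName"] = fileName;
        table.Rows.Add(row);
        return (int)row["RecID"];
    }
}
```

If you add three rows, call ClearAndResetIds, and add a fourth, the new row should come back with RecID 1. Calling only table.Clear() would leave the counter where it was, so the numbering would carry on from 4.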

Run the code to make sure it works. Select a folder with PDFs. You should see the list; the Pages column, of course, will have all zeroes because we have not yet calculated the number of pages in the PDFs.

The first thing to do is to create a PdfDocument object. The constructor takes various arguments, among them a PdfReader, a PdfWriter, or both. If you’re creating PDFs from scratch (which we’re not doing here, but just so you know), you create a PdfDocument with a PdfWriter object. If the PdfWriter names an existing file, that file will be overwritten! If you’re reading a PDF (which is what we’re doing here), you give the constructor a PdfReader object.
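To make the three combinations concrete, here is a rough sketch, assuming the iText 7 package is installed. The class and method names (PdfOpenModes, CountPages, CreateBlank, CopyWithExtraPage) are mine, not part of iText, and the paths are whatever you pass in.

```csharp
using iText.Kernel.Pdf;

public static class PdfOpenModes
{
    // Reader only: opens an existing file and leaves it unchanged.
    public static int CountPages(string path)
    {
        var pdf = new PdfDocument(new PdfReader(path));
        try
        {
            return pdf.GetNumberOfPages();
        }
        finally
        {
            pdf.Close(); // releases the file
        }
    }

    // Writer only: creates a new file. An existing file at this path is overwritten.
    public static void CreateBlank(string path)
    {
        var pdf = new PdfDocument(new PdfWriter(path));
        pdf.AddNewPage(); // a document needs at least one page before closing
        pdf.Close();
    }

    // Reader and writer: reads the source and writes a changed copy to the target.
    public static void CopyWithExtraPage(string sourcePath, string targetPath)
    {
        var pdf = new PdfDocument(new PdfReader(sourcePath), new PdfWriter(targetPath));
        pdf.AddNewPage(); // appended after the existing pages
        pdf.Close();
    }
}
```

Only the first form is needed for this application; the other two are shown for comparison.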

Let’s now calculate the number of pages. Add another button to the toolbar, called toolCalcPages, and then create code for the Click event.

We loop through the data table, creating a PdfDocument object with a PdfReader that points to the full name of each PDF file in the data table. Then we call GetNumberOfPages() to retrieve the number of pages, which we store in the appropriate column. Finally, we close the document so that the file is not left open.

private void toolCalcPages_Click(object sender, EventArgs e)
{
    foreach (DataRow dr in dataSet1.Files.Rows)
    {
        var pdf = new PdfDocument(
            new PdfReader(dr.Field<string>(dataSet1.Files.FullNameColumn)));
        try
        {
            dr.SetField(dataSet1.Files.PageCountColumn, pdf.GetNumberOfPages());
        }
        finally
        {
            // Close the document so the file handle is released.
            pdf.Close();
        }
    }
}

Since the data table is bound to the data grid view, the screen will update automatically, and you’ll be able to see how many pages each PDF has!

And now we have one final task, to calculate the page ranges. Add another button to the toolbar, called toolGetRanges, and capture its Click event, using this code. I’ve formatted it for best visual appearance, but you know white space doesn’t count in C#.

private void toolGetRanges_Click(object sender, EventArgs e)
{
    // false = overwrite, so clicking the button twice does not
    // append a second copy of the ranges to the same file.
    using (var writer =
        new StreamWriter(Path.Combine(toolFolderName.Text, "ranges.csv"), false))
    {
        int firstPage = 1;
        int lastPage = 0;
        foreach (DataRow dr in dataSet1.Files.Rows)
        {
            // Each file's range starts one page after the previous file ended.
            lastPage += dr.Field<int>(dataSet1.Files.PageCountColumn);
            writer.WriteLine(
                "{0},{1},\"{2}\"",
                firstPage,
                lastPage,
                dr.Field<string>(dataSet1.Files.FileNameColumn));
            firstPage = lastPage + 1;
        }
    }
}

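In the code, we create a CSV file (so you can load it into Excel) called ranges.csv in the same folder where the PDFs are. Then we loop through the data table, keeping a running total: each file’s first page is one more than the previous file’s last page, and its last page is the previous last page plus its own page count.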

Note the format string. The first and last page are output as numbers, but the file name is output as text between quotes so that commas in a file name don’t split it into extra columns. Also note this Excel peculiarity: there must be no space between a comma and the value that follows it, or Excel will treat the opening quote as part of the text instead of as a CSV delimiter.
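If you want to check the arithmetic without the form or any PDFs, the same running total can be sketched as a small stand-alone class. The names here (PageRanges, QuoteCsv, BuildLines) are made up for illustration. The one addition is that QuoteCsv doubles any quote characters inside a value, which is the usual CSV convention; Windows file names cannot contain quotes, so the button code above doesn’t need it.

```csharp
using System;
using System.Collections.Generic;

public static class PageRanges
{
    // Wraps a value in quotes, doubling any quotes it already contains.
    public static string QuoteCsv(string value)
    {
        return "\"" + value.Replace("\"", "\"\"") + "\"";
    }

    // Takes (file name, page count) pairs in order and returns one CSV
    // line per file: first page, last page, quoted file name.
    public static List<string> BuildLines(
        IEnumerable<KeyValuePair<string, int>> files)
    {
        var lines = new List<string>();
        int firstPage = 1;
        int lastPage = 0;
        foreach (KeyValuePair<string, int> file in files)
        {
            lastPage += file.Value;
            lines.Add(string.Format(
                "{0},{1},{2}", firstPage, lastPage, QuoteCsv(file.Key)));
            firstPage = lastPage + 1;
        }
        return lines;
    }
}
```

For example, three files with 3, 5, and 2 pages, the second one named "b, final.pdf", give the lines 1,3,"a.pdf" then 4,8,"b, final.pdf" then 9,10,"c.pdf". The comma inside the second name stays in one column because of the quotes.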

Here are the source files. I’m providing two versions. The first is the full version, including all the packages. It’s quite large, coming to almost 29 megabytes. If you’ve already used NuGet to install the packages, you don’t need this version. The second version is less than a megabyte and includes only the source files, without the packages. Try this one first, and use the full version only if you can’t get the small one to build successfully.

Here is the BIG zip file — PDFLegal 29 mb.zip
And here’s the small one — PDFLegalminimum.zip
