Tag Archives: Microsoft SQL

SQL Server 2008: The BIDS Interface and Components

SQL Server Integration Services is not the only member of the Microsoft Business Intelligence suite to use BIDS for development. Analysis Services and Reporting Services developers also use BIDS to create their work. Using project templates supplied by Microsoft, Analysis Services developers build cubes, data mining models, and other analytical objects, while report developers build reports and report models in BIDS.

BIDS supports any or all of these project types within the same solution, as
shown in Figure A.

bids1

Figure A. The BIDS Solution Explorer window showing multiple SQL Business Intelligence project types.

TIP: You can create as many Integration Services packages as you want in a single solution. Modularizing a complex ETL workflow into multiple packages makes it easier to develop and debug the individual components. When all the packages are working correctly, you can group them to execute all at once using an Execute Package task.

Creating a new Integration Services project—rather than just opening a package as a standalone file—exposes some additional sections of the interface, such as access to the Data Sources and Data Source Views folders in Solution Explorer, as you can see in the Integration Services Project in Figure A.

Items that you add to a Project are visible in the Solution Explorer window, and can be accessible to any part of the project.

  • Data Sources contain a connection string, and are available in Solution Explorer in the Data Sources folder. They are created and maintained at the project level, outside of any package definition, but can be referenced by a package’s Connection Manager. Connection Managers depend on Data Sources in Integration Services and can be used to share connection definitions among multiple packages in the same solution.

TIP: When you use Data Sources, you can change the data connection strings for multiple packages at once. Adding Connection Managers directly to a package creates local data links, which you must edit per package.

  • Data Source Views are based on Data Sources, but support further filtering of the database schema. By creating package objects based on Data Source Views, you streamline working with lists of database objects.
  • The Miscellaneous folder can contain any support files you need, such
    as documentation, flat files that contain data for import, etc.

You access Integration Services functions in BIDS or Visual Studio through the SSIS menu, the tabbed designer windows, and their related toolboxes:

  • SSIS Menu: Choose options for setting up package logging, configurations, variables, and other options, as shown in Figure B.

bids2

Figure 2. The SSIS menu.

  • Tabbed Designer: Lay out the logic of the package by dragging tasks from the toolbox to the designer to control the overall flow of processing steps. For example, you could download files via FTP before importing them into a database table.
  • Toolboxes: The toolboxes associated with some of the designer tabs let you drag and drop components that perform the many potential actions of a complete Integration Services solution. If you are familiar with Web or Windows forms development in .NET, you’ll be right at home with adding components to a package.

In this course, you will learn about the BIDS tools that enable you to build powerful ETL solutions in Integration Services. The major components of the BIDS interface include:

  • Control Flow designer
  • Data Flow designer
  • Connection Managers
  • Event Handler designer
  • Package Explorer
  • Progress pane


ldn-expertdkiely

This post is an excerpt from the online courseware for our Microsoft SQL Server 2008 Integration Services course written by expert Don Kiely. 

Don Kiely is a featured instructor on many of our SQL Server and Visual Studio courses. He is a nationally recognized author, instructor and consultant who travels the country sharing his expertise in SQL Server and security.

Microsoft SQL 2012 Developer: Creating Partitioned Tables

SQL Server 2012 includes the T-SQL extensions to allow for partitioned tables and indexes. Here are the specific steps you perform to create and maintain these database objects.

To create a partitioned table, complete the following tasks:

  1. Select an appropriate partition key.
  2. Create a partition function.
  3. Create a partition scheme.
  4. Create the table and identify its partition scheme.
Key Terms
Partition key A single column in a table whose values determine the partition on which the data resides.
Partition function A function that specifies how to partition the table by defining partition key boundaries for each subset of data.
Partition scheme A mapping of individual partitions to filegroups.

Creating a Partition Function

With a partition key selected, the next step is to define the partitions. The CREATE PARTITION FUNCTION statement specifies how to divide the table by identifying specific boundaries for each partition. The basic syntax of the CREATE PARTITION FUNCTION statement is:

Note: the following syntax details: The input_parameter_type specifies the data type of the partition key and not the name of the partitioning column. The LEFT and RIGHT keywords specify the side of a partition boundary on which a boundary_value belongs. LEFT is the default if you do not explicitly state a side.

LEFT and RIGHT Boundaries

The best way to understand the difference between LEFT and RIGHT partition boundaries is to look at some examples. Consider the following two partition functions:

 

Table below shows the specific partition boundaries for both LEFT and RIGHT versions of the partition function.

Partition Column Values (LEFT) Column Values (RIGHT)
1 <= 100 < 100
2 > 100 AND <= 1,000 >= 100 AND < 1,000
3 > 1,000 AND <= 10,000 >= 1,000 AND < 10,000
4 > 10,000 AND <= 100,000 >= 10,000 AND < 100,000
5 > 100,000 >= 100,000

The table above Partition range values defined according to LEFT and RIGHT boundaries.

A partition function that you define using LEFT boundaries indicates that the boundary value exists in the partition to the value’s left. Similarly, RIGHT boundaries place values in the partition to the value’s right.

Because all partition key values need to map to a partition, SQL Server actually creates n+1 partitions for every partition function. Notice that both functions specify only four boundary values. However, values less than the minimum and greater than the maximum are part of the entire domain of values and therefore must map to a partition.

TIP: If you need to specify acceptable minimum or maximum values, apply a CHECK constraint on the table to reject inappropriate partition key values. For example, to prevent partition key values greater than 100,000, define the partition function by using RANGE LEFT and placing a CHECK constraint on the table to reject values greater than 100,000.

In addition, a CHECK constraint on the table effectively leaves either the leftmost or the rightmost partition empty—a benefit if you add or remove partitions later.

Although not ideal, SQL Server allows partition key columns to contain NULL values and handles them using these rules:

By default, all records that have a partition key value of NULL exist on the leftmost partition.
If a partition function contains a NULL boundary value and is defined using RANGE RIGHT, the leftmost partition remains empty and rows that contain NULL values exist on the next partition.

Keep these considerations in mind when you design partition functions.

Frank TillinghastFrank Tillinghast is a senior consultant with MTOW Software Solutions. He is a Microsoft Certified Solution Developer and has been developing applications for over fifteen years. Most of his time is spent consulting for companies nationwide with troubled projects or mentoring projects to successful completion. When he is not developing software or mentoring clients, Frank is teaching other developers. He has taught thousands of application developers how to create business solutions with Visual Studio .NET. VB.NET, ASP.NET, Visual C#, and SQL Server.

User-Defined Function Overview – Microsoft SQL

User-defined functions (UDFs) were introduced in SQL Server 2000 to provide the coding functionality similar to what is available in general-purpose programming languages. UDFs provide a level of flexibility and reusability that would be impossible to duplicate by using only stored procedures, triggers, and views. These functions, like those in other programming languages, provide reusable code modules that you can invoke with parameters, perform some kind of calculations or actions, and return various
kinds of data.

SQL Server supports three kinds of user-defined functions, each with its own specific features and benefits:

  • Scalar functions are what most developers think of as user-defined functions, because they are most like the functions available in other programming languages and the built-in Transact-SQL (T-SQL) functions. A scalar function can take parameters, contain complex processing, and return a single value of almost any SQL data type.
  • Inline table-valued functions are similar to views in that they each consist of a single SELECT statement and each returns a table object. However, inline table-valued functions can take parameters, whereas views cannot. Like views, you can use inline table-valued functions in statements that update, insert, or delete data, and you can join to them. An inline table-valued function is essentially a view with parameters.
  • Multi-statement table-valued functions are similar to stored procedures, but they return an in-memory table object (a set of rows and columns that you can select from) instead of a static result set. This makes them a powerful view/stored procedure hybrid, because you can perform complex processing in them as in a stored procedure and select data from them as in a view. However, the tables that these functions return are not updatable.

TIP: Another way to categorize the types of functions available in SQL Server is
as scalar and table-valued, based on the type of data returned. With that
categorization, there are then two types of table-valued functions.

UDFs are not allowed to have side effects, which is a bit of a limitation on their use. This means that they are not allowed to make any lasting changes in the database, such as to make changes to data or to alter the schema of any object. But there are more subtle actions that cause changes and therefore are not allowed in UDFs. For example, the RAND function, which generates random numbers, retains state information about the last random number it generated, which changes the state of the function and requires SQL Server to store that state somewhere.

User-defined functions combine some of the best elements of stored procedures and views. Many tasks that you previously performed by using views and stored procedures you can now handle better with UDFs. You can invoke these functions from views, stored procedures, other user-defined functions, batches, and scripts, as well as from external applications.

NOTE:  SQL Server 2000 did not permit nondeterministic functions in the body of a user-defined function. A deterministic function, like LEN, always returns the same result anytime it is called with the same argument values. A nondeterministic function, such as GETDATE(), always returns a different value on each invocation. However, in SQL Server 2005 and later, things have loosened up: GETDATE() is now permissible in UDFs.

Common Language Runtime UDFs

You can create each type of UDF in SQLCLR code as well as Transact-SQL. Transact-SQL is recommended for functions that rely on set-related data manipulation and SQLCLR functions are recommended for functions that require intensive computations or string manipulations, neither of which T-SQL is well suited for. In some cases, the .NET Framework contains base classes that supply functionality that would be difficult to duplicate in Transact-SQL, such as string processing that uses the RegEx class. However, if your functions involve fairly straightforward data manipulation, Transact-SQL is the best choice. This chapter will focus on T-SQL UDFs.

ldn-expertdkielyThis post is an excerpt from the online courseware for our Microsoft SQL Server 2012 Developer course written by expert Don Kiely.

Don Kiely is a featured instructor on many of our SQL Server and Visual Studio courses. He is a nationally recognized author, instructor and consultant who travels the country sharing his expertise in SQL Server and security.

Types of Spatial Data

SQL Server includes two spatial data types—Geometry and Geography—that encapsulate the database’s support for spatial data. Each data type includes pretty much the same point, line, shape, and collection objects, as well as rich sets of methods and properties for working with these shapes. Figure below shows the hierarchy of available objects. The way this works is that you define a field or variable as either the Geometry or Geography data type, then assign to that field or variable data that describes the point, line, shape, or collection you want to store. The yellow/lighter objects shown in the figure are abstract objects, while the blue/darker objects are concrete objects that you can store in the spatial data type. The only difference between the Geometry and Geography object hierarchies is that in Geography there is a FullGlobe object that represents the complete surface of the earth. This is shown in Figure below in a lighter color and dashed line because it is not part of the Geometry object hierarchy.
TypeofSpatialimg1

The shapes defined by the various spatial objects are vector objects, which are collections of points, lines, curves, polygons, and compound curves. You’re likely to use the following spatial objects most often to store and manipulate within SQL Server:

  • Point: Identified by a pair of coordinates as an exact location as X and Y coordinates, but can also include Z (elevation) and M (measure). SQL Server doesn’t use the Z and M coordinates, but lets you store the data so that client applications can use them.
  • LineString: A path along a sequence of points. It is a one-dimensional shape: it has length but no area, even if multiple connected LineStrings look like a closed shape. The LineString is simple if it doesn’t cross itself, or a ring if the starting point is the same as the ending point.
  • CircularString: Similar to a LineString, but defines a curve as a portion of a circle instead of a straight line.
  • Polygon: A closed 2-dimensional shape; a ring. A polygon has both a length property and an area. It can have holes in its interior, the shapes of which are defined by other, smaller polygons. The area of a hole is excluded from the area of the outer polygon.

You can also create collections of these basic shapes, including a MultiPoint, MultiLineString, and MultiPolygon, each of which can contain only the specified shapes. If you need a collection of any type of shape, including the multi-shapes, you can use a GeomCollection. This makes it easy to work with groups of shapes as a single unit.

The Geography and Geometry data types are .NET classes implemented as system SQLCLR objects with several dozens of methods. Many of the methods, about two-thirds, have names that start with the prefix ST, such as STArea, STLength, STDistance, and STGeomFromText. These ST methods are implementations of Open Geophysical Consortium standard methods (the ST stands for spatiotemporal). These methods provide the most commonly used, basic features for working with spatial data.

Microsoft also added some custom extension methods to the objects to support features not part of the OGC standards. These method names don’t have the ST prefix, such as Parse, ToString, and Reduce.

The nice thing about the Geography and Geometry data types is that they share a lot of methods that work the same across the two types. The main difference is that with the Geography type you’ll use latitude and longitude coordinates but with the Geometry type you’ll generally use x and y coordinates.
ldn-expertdkielyThis post is an excerpt from the online courseware for our Microsoft SQL Server 2012 Developer course written by expert Don Kiely. 

Don Kiely is a featured instructor on many of our SQL Server and Visual Studio courses. He is a nationally recognized author, instructor and consultant who travels the country sharing his expertise in SQL Server and security.