How to find which columns don't have any data (all values are NULL)?

I have several tables in a database. I would like to find which columns (in which tables) don't have any values (all NULL in a column). In the example below, the result should be:

TestTable1 --> Var2
TestTable2 --> Variable1

I don't have any idea how to create this kind of query. Your help is most appreciated!

--create first table
create table dbo.TestTable1 (
sur_id int identity(1,1) not null primary key,
var1 int null,
var2 int null
)
go

--insert some values
insert into dbo.TestTable1 (var1) 
    select 1 union all select 2 union all select 3

--create second table
create table dbo.TestTable2 (
sur_id int identity(1,1) not null primary key,
variable1 int null,
variable2 int null
)

--and insert some values
insert into dbo.TestTable2 (variable2) 
    select 1 union all select 2 union all select 3

For a single column, count(ColumnName) returns the number of rows where ColumnName is not null:

select  count(TheColumn)
from    YourTable
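
To decide whether a column is entirely NULL, compare that count with the total row count, or let HAVING do the comparison. A minimal sketch against dbo.TestTable1 from the question (it returns a row only when var2 holds no non-NULL values and the table is not empty):

select  'TestTable1.var2' as EmptyColumn
from    dbo.TestTable1
having  count(var2) = 0 and count(*) > 0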

You can generate such a query for all columns. Per Martin's suggestion, you can skip columns that cannot be NULL by filtering on is_nullable = 1. For example:

select  'count(' + name + ') as ' + name + ', '
from    sys.columns
where   object_id = object_id('YourTable')
        and is_nullable = 1
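
Pasted together (with a row count added and the trailing comma removed), the generated pieces for dbo.TestTable1 from the question would look roughly like this:

select  count(*)    as TotalRows,
        count(var1) as var1,
        count(var2) as var2
from    dbo.TestTable1

Any column that comes back as 0 while TotalRows is greater than 0 is entirely NULL; here var2 returns 0.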

If the number of tables is large, you can generate a query for all tables in a similar way. The list of all tables is in sys.tables.
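
As a hedged sketch of that idea (the EmptyColumn label and the HAVING filter are my own choices, not part of the answer), join sys.tables to sys.columns and emit one small test query per nullable column; running the generated statements returns one row for every all-NULL column in a non-empty table:

select  'select ''' + t.name + '.' + c.name + ''' as EmptyColumn from '
      + quotename(schema_name(t.schema_id)) + '.' + quotename(t.name)
      + ' having count(' + quotename(c.name) + ') = 0 and count(*) > 0;'
from    sys.tables t
join    sys.columns c on c.object_id = t.object_id
where   c.is_nullable = 1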


Here's a script I wrote to do the same thing; it's a three-step manual process:

  1. Run this script in SSMS and select all of the rows in the Results pane:
SELECT
'SELECT
    COUNT( DISTINCT [' + COLUMN_NAME + ']) AS UniqueValues,
    ''' + TABLE_NAME + '.' + COLUMN_NAME + ''' AS ColumnName
FROM
    [' + TABLE_SCHEMA + '].[' + TABLE_NAME + ']

UNION ALL
'
FROM
    INFORMATION_SCHEMA.COLUMNS
ORDER BY
    TABLE_NAME,
    COLUMN_NAME
  2. Paste the results into a new query window. Scroll to the very bottom and remove the trailing UNION ALL statement. It will look like this:
SELECT   COUNT( DISTINCT [ModifiedByUserId]) AS UniqueValues,   'Inspections.ModifiedByUserId' AS ColumnName  FROM   [dbo].[Inspections]     UNION ALL  
SELECT   COUNT( DISTINCT [Notes]) AS UniqueValues,   'Inspections.Notes' AS ColumnName  FROM   [dbo].[Inspections]     UNION ALL  
SELECT   COUNT( DISTINCT [PublicPassword]) AS UniqueValues,   'Inspections.PublicPassword' AS ColumnName  FROM   [dbo].[Inspections]     UNION ALL  
SELECT   COUNT( DISTINCT [ShopId]) AS UniqueValues,   'Inspections.ShopId' AS ColumnName  FROM   [dbo].[Inspections]     UNION ALL  
SELECT   COUNT( DISTINCT [Status]) AS UniqueValues,   'Inspections.Status' AS ColumnName  FROM   [dbo].[Inspections]     UNION ALL  
SELECT   COUNT( DISTINCT [SupervisorUserId]) AS UniqueValues,   'Inspections.SupervisorUserId' AS ColumnName  FROM   [dbo].[Inspections]     UNION ALL  
  3. Run the query. It took about 6 minutes to run on a 300-column database. It will be faster or slower depending on how many indexes are being used.
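
If the goal is specifically the all-NULL columns, the pasted query can also be wrapped in a derived table and filtered on UniqueValues = 0, since COUNT(DISTINCT ...) ignores NULLs. A sketch using one of the generated pieces shown above:

select  ColumnName
from (
    -- paste the generated UNION ALL query here
    SELECT   COUNT( DISTINCT [Notes]) AS UniqueValues,   'Inspections.Notes' AS ColumnName  FROM   [dbo].[Inspections]
) as counts
where   UniqueValues = 0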


Updated... okay, I had way too much fun with this.

The proc accepts two parameters: the table to search and the criteria to apply. You can pass essentially any WHERE clause to the second parameter. I wrote the proc to translate double quotes back to single quotes... again, this was built off of the original developer's concepts.

GO
/****** Object:  StoredProcedure [dbo].[SearchAllTables]    Script Date: 05/04/2011 14:29:48 ******/
SET ANSI_NULLS ON
GO
SET QUOTED_IDENTIFIER ON
GO
alter PROC [dbo].[SearchAllTables2] ( @SEARCH_TABLE NVARCHAR(255), @CONDITION AS NVARCHAR(MAX) ) AS
BEGIN

-- Copyright © 2002 Narayana Vyas Kondreddi. All rights reserved.
-- Purpose: To search all columns of all tables for a given search string
-- Written by: Narayana Vyas Kondreddi -- Site: http://vyaskn.tripod.com
-- Tested on: SQL Server 7.0 and SQL Server 2000
-- Date modified: 28th July 2002 22:50 GMT
CREATE TABLE #Results (ColumnName nvarchar(370), ColumnValue nvarchar(3630))
SET NOCOUNT ON
DECLARE @TableName nvarchar(256), @ColumnName nvarchar(128), @SearchStr2 nvarchar(110), @COND_STR NVARCHAR(MAX)
SET @TableName = '' 
--SET @SearchStr2 = QUOTENAME('%' + @SearchStr + '%','''')
WHILE @TableName IS NOT NULL
BEGIN
SET @ColumnName = ''
SET @CONDITION = REPLACE(@CONDITION,'"','''')
SET @TableName =
(
SELECT MIN(QUOTENAME(TABLE_SCHEMA) + '.' + QUOTENAME(TABLE_NAME))
FROM INFORMATION_SCHEMA.TABLES
WHERE
TABLE_TYPE = 'BASE TABLE'
AND
QUOTENAME(TABLE_SCHEMA) + '.' + QUOTENAME(TABLE_NAME) > @TableName
AND OBJECTPROPERTY(
OBJECT_ID(
QUOTENAME(TABLE_SCHEMA) + '.' + QUOTENAME(TABLE_NAME)
), 'IsMSShipped'
) = 0 AND TABLE_NAME = @SEARCH_TABLE
)

WHILE (@TableName IS NOT NULL) AND (@ColumnName IS NOT NULL)
BEGIN
    SET @ColumnName = (
SELECT MIN(QUOTENAME(COLUMN_NAME)) 
FROM INFORMATION_SCHEMA.COLUMNS 
WHERE TABLE_SCHEMA = PARSENAME(@TableName, 2) AND 
TABLE_NAME = PARSENAME(@TableName, 1) AND QUOTENAME(COLUMN_NAME) > @ColumnName ) 
IF @ColumnName IS NOT NULL 
BEGIN 
SET @COND_STR = REPLACE(@CONDITION,'''','"')
INSERT INTO #Results 
EXEC ( 'SELECT  DISTINCT ''' + @TableName + '.' + @ColumnName + ''',''' + @COND_STR + ''' AS CONDITION FROM ' + @TableName + ' (NOLOCK) ' + ' WHERE ' + @ColumnName + ' ' + @CONDITION)
PRINT ( 'SELECT  DISTINCT ''' + @TableName + '.' + @ColumnName + ''',''' + @COND_STR + ''' AS CONDITION FROM ' + @TableName + ' (NOLOCK) ' + ' WHERE ' + @ColumnName + ' ' + @CONDITION)
END 
END 
END 
    SELECT ColumnName, ColumnValue 
    FROM #Results 
END
GO
-- to execute

exec [SearchAllTables2] 'TABLENAME','LIKE "%DOUG%"' -- double quotes are automatically escaped to single quotes...
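
For the original question, an IS NULL condition makes this proc list every column in the table that contains at least one NULL (not only the all-NULL columns). A hedged example against the question's first table:

exec [SearchAllTables2] 'TestTable1', 'IS NULL'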

The above was modified from the original code below (copyright retained); only portions were used.

GO
/****** Object:  StoredProcedure [dbo].[SearchAllTables]    Script Date: 05/04/2011 14:29:48 ******/
SET ANSI_NULLS ON
GO
SET QUOTED_IDENTIFIER ON
GO
alter PROC [dbo].[SearchAllTables2] ( @TABLE_NAME NVARCHAR(255) ) AS
BEGIN

-- Copyright © 2002 Narayana Vyas Kondreddi. All rights reserved.
-- Purpose: To search all columns of all tables for a given search string
-- Written by: Narayana Vyas Kondreddi -- Site: http://vyaskn.tripod.com
-- Tested on: SQL Server 7.0 and SQL Server 2000
-- Date modified: 28th July 2002 22:50 GMT
CREATE TABLE #Results (ColumnName nvarchar(370), ColumnValue nvarchar(3630))
SET NOCOUNT ON
DECLARE @TableName nvarchar(256), @ColumnName nvarchar(128), @SearchStr2 nvarchar(110)
SET @TableName = '' 
--SET @SearchStr2 = QUOTENAME('%' + @SearchStr + '%','''')
WHILE @TableName IS NOT NULL
BEGIN
SET @ColumnName = ''
SET @TableName =
(
SELECT MIN(QUOTENAME(TABLE_SCHEMA) + '.' + QUOTENAME(TABLE_NAME))
FROM INFORMATION_SCHEMA.TABLES
WHERE
TABLE_TYPE = 'BASE TABLE'
AND
QUOTENAME(TABLE_SCHEMA) + '.' + QUOTENAME(TABLE_NAME) > @TableName
AND OBJECTPROPERTY(
OBJECT_ID(
QUOTENAME(TABLE_SCHEMA) + '.' + QUOTENAME(TABLE_NAME)
), 'IsMSShipped'
) = 0 AND TABLE_NAME = @TABLE_NAME
)

WHILE (@TableName IS NOT NULL) AND (@ColumnName IS NOT NULL)
BEGIN
    SET @ColumnName = (
SELECT MIN(QUOTENAME(COLUMN_NAME)) 
FROM INFORMATION_SCHEMA.COLUMNS 
WHERE TABLE_SCHEMA = PARSENAME(@TableName, 2) AND 
TABLE_NAME = PARSENAME(@TableName, 1) AND QUOTENAME(COLUMN_NAME) > @ColumnName ) 
IF @ColumnName IS NOT NULL 
BEGIN 
INSERT INTO #Results 
EXEC ( 'SELECT  DISTINCT ''' + @TableName + '.' + @ColumnName + ''', ''IS NULL'' FROM ' + @TableName + ' (NOLOCK) ' + ' WHERE ' + @ColumnName + ' IS NULL ')--LIKE ' + @SearchStr2 ) 
--PRINT ( 'SELECT  DISTINCT ''' + @TableName + '.' + @ColumnName + ''', ''IS NOT NULL'' FROM ' + @TableName + ' (NOLOCK) ' + ' WHERE  ' + @ColumnName + ' IS NOT NULL ')--LIKE ' + @SearchStr2 ) 
END 
END 
END 
    SELECT ColumnName, ColumnValue 
    FROM #Results 
END
GO
-- to execute

exec [SearchAllTables2] 'Master'


This is way too useful not to be put into a handy little system proc - especially if you've just inherited a legacy database, and are wondering what columns you can drop or ignore.

/*
Show the count of not-null values in a table
*/
create proc sp_aaShowAllNullColumns @tableName varchar(255) as
begin
set nocount on

declare @sql nvarchar(4000)
declare @cols nvarchar(4000)
declare @tcols table( colbit nvarchar(255) )

insert @tcols
select 'count(' + name + ') as ' + name + ', ' as colbit
from    sys.columns
where   object_id = object_id(@tableName)
and is_nullable = 1 

-- concatenate the column expressions; the leading ', ' slots in after 'count(*) as Rows' below
select @cols = coalesce( @cols, ', ', '' ) + colbit from @tcols
-- strip the trailing comma (len() ignores the trailing space)
select @cols = substring( @cols, 1, (len(@cols) - 1) )
-- table has no nullable columns at all: fall back to an empty string
select @cols = isnull( @cols, '' )

select @sql = 'select count(*) as Rows' + @cols + ' from ' + @tableName
exec sp_executesql @sql

end
go

exec sys.sp_MS_marksystemobject 'sp_aaShowAllNullColumns'
go

use Bookshop
go
exec sp_aaShowAllNullColumns 'Books'
go
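
Run in the database that holds the question's tables, dbo.TestTable1 (three rows, var2 never populated) would come back roughly like this; any column showing 0 while Rows is greater than 0 is entirely NULL:

exec sp_aaShowAllNullColumns 'TestTable1'

-- Rows  var1  var2
-- ----  ----  ----
-- 3     3     0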


Here's a bash script that identifies the all-NULL columns in every non-empty table of an SQLite database (or only in the tables you specify). The same technique can be used in any programming language that can talk to an SQLite database.

#!/bin/bash

function help {
    cat <<EOF
Syntax: $0 databasefile [table ...]

If no tables are specified, then for each non-empty user table in the
specified SQLite database, this script will emit the column names
of those columns in which all the values are NULL.  If any tables are
specified, only the specified tables are scanned.

The script is written to make it easy to modify the criteria and the output.

Thanks to SQL, two passes are required per table, and if no tables are
specified, an additional invocation of sqlite3 is required.

Column names are written in the form: tablename.columnname

Requirements:
  sqlite3 on the \$PATH

Options: 
 -v | --verbose :: emit additional information

EOF
}

while [ "$1" ]
do case "$1" in
      -h | --help | "" ) help
           exit
           ;;
      -v | --verbose ) VERBOSE=1
           shift
           ;;
      * ) break
      ;;
  esac
done

function verbose { if [ "$VERBOSE" ] ; then echo "$@" >&2 ; fi ; }

db="$1"
shift

if [ ! -s "$db" ] ; then echo "$0 : $db not found" ; exit ; fi

# To prevent loading ~/.sqliterc specify -init ""

# Global: db
function nullcolumns {
  local table="$1"
  local count column field nulls
  # stdin (from the process substitution below): first line is the row count,
  # the remaining lines are the column names
  ( read count
    if [ -n "$count" ] ; then
        verbose "Row count for $table: $count"
        if [ "$count" -gt 0 ] ; then
          while read column ; do
            # double-quote the column so SQLite reads it as an identifier,
            # not as a string literal
            echo "SELECT '$column', * FROM
                    (SELECT COUNT(*) FROM $table WHERE \"$column\" IS NULL);"
          done |
          sqlite3 -readonly "$db" | while IFS="|" read field nulls ; do
            verbose $table.$field ... $nulls
            if [ "$nulls" -eq "$count" ] ; then echo "$table.$field" ; fi
          done
        else cat > /dev/null
        fi
    else cat > /dev/null
    fi ) < <(sqlite3 -readonly "$db" "select count(*) from '$table'; 
                                      select name from pragma_table_info( '$table' )")
}

if [ $# = 0 ] ; then
    # .tables can print several names per line, so ask sqlite_master for one name per line
    sqlite3 -readonly "$db" \
        "SELECT name FROM sqlite_master WHERE type='table' AND name NOT LIKE 'sqlite_%';" |
    while read -r table ; do
        nullcolumns "$table"
    done
else
    for table ; do
        nullcolumns "$table"
    done
fi
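
For reference, each statement the script feeds to sqlite3 has the shape below (shown with the question's TestTable1 and var2 substituted in as stand-ins); a column is reported only when its NULL count equals the table's total row count:

SELECT 'var2', * FROM
        (SELECT COUNT(*) FROM TestTable1 WHERE "var2" IS NULL);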


Comments
  • Why not COUNT(col)? That only counts NOT NULL values.
  • @Martin: Good point, I guess I'm more comfortable with an explicit case. Edited in answer.
  • Cool - I did hit one error when a column's datatype was text, and got around it by casting: CASE WHEN system_type_id = 35 THEN 'count(CAST(' + name + ' as VARCHAR(MAX))) as ' + name + ', ' ELSE 'count(' + name + ') as ' + name + ', ' END
  • If you hit the error "The query processor ran out of internal resources and could not produce a query plan. This is a rare event and only expected for extremely complex queries or queries that reference a very large number of tables or partitions. Please simplify the query. If you believe you have received this message in error, contact Customer Support Services for more information.", simply create a table with two columns, change the dynamic SQL to start with INSERT INTO TEMPYTABLE (COL1, COL2), and remove the UNION ALL at the end.
  • If you only want to know about all-null columns, you can remove "distinct" and it will run much faster.
  • Good job. This is the only answer that works, and you don't need to install anything. It's a two-step process, but it saved me a load of time. It ran overnight on a database with about 5 million rows and 1k columns.