How to retrieve a column value based on another column value into a variable

I'm new to Scala programming. I have a use case where I need to retrieve a column value into a variable, based on the value of another column in a DataFrame.

This is in Scala.

I have the following data frame

I need to get the value of the location column into a variable, based on the name passed in. For example, if the name passed in is 'xxx', I need the value 'India' from the data frame in a variable.

This assumes the value passed in is unique within the DataFrame; otherwise multiple rows are returned and you have to handle that case separately. Here is one way to solve it:

scala> import spark.implicits._
import spark.implicits._

scala> val df = Seq(("XXX",34, "India"), ("YYY", 42, "China"), ("ZZZ", 36, "America")).toDF("name", "age", "location")
scala> df.show()
+----+---+--------+
|name|age|location|
+----+---+--------+
| XXX| 34|   India|
| YYY| 42|   China|
| ZZZ| 36| America|
+----+---+--------+
scala> val input = "XXX"
input: String = XXX
scala> val location = df.filter(s"name = '$input'").select("location").collect()(0).getString(0)
location: String = India

Hopefully that solves your requirement.
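
The collect()(0) call above throws if no row matches, and the interpolated filter string breaks if the input contains a quote. A minimal sketch of a more defensive variant follows, assuming the same three-column DataFrame. The helper name locationFor is hypothetical (it is not part of Spark), and the local SparkSession is only there to make the snippet self-contained; in spark-shell, spark already exists. Note that the comparison is case-sensitive, so 'xxx' would not match "XXX".

```scala
import org.apache.spark.sql.{DataFrame, SparkSession}
import org.apache.spark.sql.functions.col

// Local session for illustration only; spark-shell already provides `spark`.
val spark = SparkSession.builder()
  .master("local[*]")
  .appName("lookup-sketch")
  .getOrCreate()
import spark.implicits._

val df = Seq(("XXX", 34, "India"), ("YYY", 42, "China"), ("ZZZ", 36, "America"))
  .toDF("name", "age", "location")

// Hypothetical helper (not a Spark API): returns the location for a name,
// or None when no row matches. If several rows match, the first one is used.
def locationFor(df: DataFrame, name: String): Option[String] =
  df.filter(col("name") === name) // Column comparison, no string interpolation
    .select("location")
    .as[String]
    .head(1)                      // Array with at most one element
    .headOption

val found = locationFor(df, "XXX")
val missing = locationFor(df, "AAA")
```

Returning an Option pushes the "no match" decision to the caller, for example locationFor(df, input).getOrElse("unknown").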

If I understand correctly, this is just a filter followed by selecting the corresponding value of location. The following code is an example:

import org.apache.spark.sql.catalyst.encoders.RowEncoder
import org.apache.spark.sql.{Row, SparkSession}
import org.apache.spark.sql.types.DataTypes._
import org.apache.spark.sql.types.{StructField, StructType}
import org.apache.spark.sql.functions.col
import org.scalatest.FunSuite

class FilterTest extends FunSuite {

  test("filter test") {

    val spark = SparkSession.builder()
      .master("local")
      .appName("filter test")
      .getOrCreate()

    val schema = StructType(
      Seq(
        StructField("name", StringType, true),
        StructField("age", IntegerType, true),
        StructField("location", StringType, true)
      )
    )

    val data = Seq(
      Row("XXX", 34, "India"),
      Row("YYY", 42, "China"),
      Row("ZZZ", 36, "America")
    )

    val dataset = spark.createDataset(data)(RowEncoder(schema))
    val value = dataset.filter(col("name") === "XXX").first().getAs[String]("location")
    assert(value == "India")
  }
}

You can use filter to get the row where the name column equals XXX. Once you have the row, you can read any of its columns.

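// Keep only the rows whose first column (name) equals "XXX"
val filteredRows = dataFrame.filter(row => row.get(0).equals("XXX"))
// Read the third column (location) of the first matching row
filteredRows.first().get(2)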

Comments
  • I'm not really sure what you mean. It would help a lot if you could add an example here.